The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention also relates to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety

invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a 
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier.

symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
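
As an illustration of the base-pairing rule, the following minimal Python sketch derives the complementary strand of a short DNA sequence and scores partial versus complete complementarity. The function names (complement, reverse_complement, complementarity) are hypothetical and are not part of the disclosure; the symbol N is simply carried through as N.

```python
# Watson-Crick pairing for DNA; N (any base) is kept as N in the complement.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}

def complement(seq):
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq):
    """Complementary strand written in its own 5'->3' direction."""
    return complement(seq)[::-1]

def complementarity(strand_5to3, other_3to5):
    """Fraction of aligned positions that base-pair (1.0 means complete)."""
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    pairs = zip(strand_5to3.upper(), other_3to5.upper())
    matches = sum(PAIRS[a] == b for a, b in pairs)
    return matches / len(strand_5to3)

print(complement("AGT"))              # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))      # ACT, the same strand written 5'->3'
print(complementarity("AGT", "TCA"))  # 1.0 (complete)
```

A value below 1.0 from complementarity corresponds to what the definition calls "partial" complementarity.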
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human genome,
there are three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are approximately 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
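
The arithmetic behind these figures can be checked with a short Python sketch. It assumes, as the text does, a genome of three billion base pairs and an expressed fraction of 5% (the text's upper bound); the function names are hypothetical and are introduced only for illustration.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # expressed sequences, using the text's upper bound of about 5%

def distinct_kmers(k):
    """Number of possible sequences of length k over the bases A, C, G, T."""
    return 4 ** k

def kmers_per_position(k, target_bp=GENOME_BP):
    """How many distinct k-mers exist for every base pair of the target."""
    return distinct_kmers(k) / target_bp

cases = [
    (20, GENOME_BP),                       # twenty-mer against the genome
    (17, GENOME_BP),                       # seventeen-mer against the genome
    (15, GENOME_BP * EXPRESSED_FRACTION),  # fifteen-mer against expressed sequences
]
for k, target in cases:
    print(f"{k}-mer: about 1 in {kmers_per_position(k, target):.0f}")
```

Under these assumptions the ratios come out at roughly 367, 6 and 7, which is the same order of magnitude as the rounded figures of 300, 5 and 5 quoted in the text.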

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability [...]. The probability that an eighteen-mer with a single mismatch can be
detected in an [...] is approximately one in five. The probability that a twenty-mer with a single
mismatch can be detected in a human genome is approximately one in five.
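
The single-mismatch estimate can be sketched in Python under one explicit assumption that is supplied here for illustration and is not taken from the text: a segment of length L has 3 x L single-mismatch variants, because each position can be replaced by any of the three other bases. The genome size of three billion base pairs is the figure used in this description; the function names are hypothetical.

```python
GENOME_BP = 3_000_000_000  # base pairs in one set of human chromosomes

def full_match_probability(length):
    """Probability that a random position matches a given segment exactly."""
    return 1 / 4 ** length

def single_mismatch_variants(length):
    """Assumption: each of the `length` positions can take 3 other bases."""
    return 3 * length

def expected_single_mismatch_hits(length, target_bp=GENOME_BP):
    """Expected number of single-mismatch occurrences in a target of target_bp."""
    per_position = full_match_probability(length) * single_mismatch_variants(length)
    return per_position * target_bp

for length in (25, 20):
    hits = expected_single_mismatch_hits(length)
    print(f"{length}-mer: expected single-mismatch hits in genome = {hits:.2g}")
```

With these inputs a twenty-mer gives roughly 0.16 expected hits, about one in six, which is in line with the "approximately one in five" figure stated for the human genome, while a twenty-five mer gives a far smaller value.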

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
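
The ORF definition can be expressed as a simple check. This Python sketch assumes the three termination codons of the standard genetic code (TAA, TAG, TGA) and a DNA sequence read in frame from its first base; the function name is_open_reading_frame is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet

def codons(seq):
    """Split a sequence into consecutive nucleotide triplets."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def is_open_reading_frame(seq):
    """True if seq is a whole number of triplets with no termination codon."""
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(seq))

print(is_open_reading_frame("ATGGCCAAA"))  # True
print(is_open_reading_frame("ATGTAAGCC"))  # False: TAA is a termination codon
```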

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
[...] elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still
control transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
[...] differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide [...]
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as 
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
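
The four amino acid groupings listed above can be captured in a small lookup table. The Python sketch below uses standard one-letter codes and treats a substitution as "conservative" when both residues fall in the same listed group; this is a simplification for illustration, and the names CLASSES, residue_class and is_conservative are hypothetical.

```python
# Amino acid classes as grouped in the text, written as one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),     # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"), # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),             # Arg, Lys, His
    "acidic": set("DE"),             # Asp, Glu
}

def residue_class(residue):
    """Return the name of the class that contains the given residue."""
    for name, members in CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")

def is_conservative(original, replacement):
    """True when both residues belong to the same class."""
    return residue_class(original) == residue_class(replacement)

print(is_conservative("L", "I"))  # True: both nonpolar
print(is_conservative("K", "D"))  # False: basic versus acidic
```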

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences [...] that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent [...] amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity, and most preferably at least 99% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, and most preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g., by varying
hybridization conditions.
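
The numerical part of the "substantially equivalent" definition (changes divided by the number of residues in the subject sequence, at most about 0.35) can be sketched in Python. As an assumption made here for illustration, the number of substitutions, additions and deletions is taken to be the minimum edit (Levenshtein) distance; this is not the Jotun Hein method named above, and the check says nothing about functional equivalence. All function names are hypothetical.

```python
def edit_distance(reference, subject):
    """Minimum number of substitutions, additions and deletions (Levenshtein)."""
    previous = list(range(len(subject) + 1))
    for i, ref_res in enumerate(reference, start=1):
        current = [i]
        for j, sub_res in enumerate(subject, start=1):
            cost = 0 if ref_res == sub_res else 1
            current.append(min(previous[j] + 1,          # residue deleted from reference
                               current[j - 1] + 1,       # residue added in subject
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]

def fraction_varied(reference, subject):
    """Changes divided by the number of residues in the subject sequence."""
    return edit_distance(reference, subject) / len(subject)

def within_variation_limit(reference, subject, max_variation=0.35):
    """True when the subject varies from the reference by no more than the limit."""
    return fraction_varied(reference, subject) <= max_variation

ref = "ACGTACGTAC"
var = "ACGTTCGTAC"  # one substitution in ten residues
print(fraction_varied(ref, var))         # 0.1, i.e. 90% sequence identity
print(within_variation_limit(ref, var))  # True
```

Tighter embodiments correspond to smaller values of max_variation, for example 0.30 for 70% sequence identity or 0.05 for 95% sequence identity.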

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
20 marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968, 2953-
3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a

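polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding domains.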

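The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.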

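The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-
3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited
above.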

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other
polynucleotide sequences in the same family of genes or can differentiate human genes from
genes of other species, and are preferably based on unique nucleotide sequences.
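
As a rough illustration of basing probes on unique nucleotide sequences, the following Python sketch lists the fixed-length fragments of a target sequence that do not occur in a set of related sequences. Exact substring matching is only a crude stand-in for hybridization specificity, which also depends on mismatches, temperature and salt conditions. The function name `unique_fragments` and the default length of 17 are assumptions for this example.

```python
from typing import List, Sequence, Tuple


def unique_fragments(target: str, others: Sequence[str],
                     k: int = 17) -> List[Tuple[int, str]]:
    """Return (offset, fragment) pairs for k-mers of `target` absent from `others`.

    Assumption: a fragment counts as "unique" if it has no exact match in any
    of the other sequences; real probe design would also tolerate mismatches.
    """
    background = set()
    for seq in others:
        s = seq.upper()
        for i in range(len(s) - k + 1):
            background.add(s[i:i + k])
    t = target.upper()
    return [(i, t[i:i + k])
            for i in range(len(t) - k + 1)
            if t[i:i + k] not in background]
```

For instance, calling `unique_fragments(gene, family_members, k=20)` on hypothetical inputs would return candidate 20-mers that distinguish `gene` from the listed family members under this exact-match criterion.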

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly
contemplated.
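
Codon variability of this kind can be checked with a short Python sketch that translates two coding sequences with the standard genetic code and compares the resulting amino acid sequences. The function names `translate` and `encodes_same_protein` are assumptions for this example, and the sketch assumes both sequences start in frame at the first base. The stop codon is written as `*`, as in the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encodes_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if two coding sequences differ only by synonymous codons."""
    return translate(dna_a) == translate(dna_b)
```

Here `ATGGCTCTGTAA` and `ATGGCCTTATAG` differ at three codons yet both translate to `MAL*`.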

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.
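
To make the notion of a local sequence alignment concrete, the following Python sketch computes the best local alignment score between two sequences by dynamic programming (the Smith-Waterman recurrence). This is not BLAST, which uses a faster heuristic and statistical scoring; it is substituted here only because it is short and exact. The function name `local_alignment_score` and the scoring values (match 2, mismatch -1, gap -2) are assumptions for this example.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of `a` against `b` (linear gap penalty)."""
    a, b = a.upper(), b.upper()
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A nearest neighbor under this sketch would be the database entry with the highest score against the query; a production search would use an established tool rather than this quadratic-time loop.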

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.
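
The oligonucleotide used in site-directed mutagenesis, as described above, carries the changed codon flanked by unchanged template sequence on both sides. A minimal Python sketch of that layout follows. The function name `mutagenic_primer`, the zero-based codon numbering, and the default flank of 12 nucleotides are assumptions for this example; the sketch does not check melting temperature or secondary structure, which a practical design would.

```python
def mutagenic_primer(template: str, codon_index: int, new_codon: str,
                     flank: int = 12) -> str:
    """Build a sense-strand primer that replaces one codon of `template`.

    Assumptions: `template` is an in-frame coding sequence starting at
    position 0, and `codon_index` counts codons from zero.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    # Unchanged flanks on both sides let the primer form a stable duplex.
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

For example, changing the third codon of `ATGGCTCTGAAAGGG` from `CTG` to `CCG` with a 6-nucleotide flank yields `ATGGCTCCGAAAGGG`.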

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

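The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-
3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill in
the art and are commercially available for generating the recombinant constructs of the present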

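invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL (Pharmacia).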

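The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.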

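Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct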






WO 01/57190 PCT/US01/04098 
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed between the antisense and sense nucleic 
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
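
The Watson-Crick design rule described above for antisense oligonucleotides can be sketched in Python as taking the reverse complement of a window of the coding (sense) strand, for example a window around the translation start site. The function names `reverse_complement` and `antisense_oligo`, the DNA alphabet (with N for any base, as in the Sequence Listing), and the default length of 20 are assumptions for this example. Hoogsteen pairing and modified nucleotides are outside its scope.

```python
# Watson-Crick complements for the DNA alphabet; N (any base) maps to N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense: str, start: int, length: int = 20) -> str:
    """Antisense oligonucleotide against sense[start:start + length].

    Assumption: `sense` is the coding strand written 5' to 3' and `start`
    is a zero-based offset chosen by the user, e.g. near the ATG start codon.
    """
    sense = sense.upper()
    if start < 0 or length <= 0 or start + length > len(sense):
        raise ValueError("window lies outside the sense sequence")
    return reverse_complement(sense[start:start + length])
```

For the sense window `ACCATGGCT`, which spans an ATG start codon, the sketch returns `AGCCATGGT`.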

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-
984, 1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express 
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express 
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it 
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in co- 
amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the 
polynucleotides of the invention, can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, 
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. 






WO 01/57190 PCT/US01/04098 
The most preferred cells are those which do not normally express the particular polypeptide or 
protein or which express the polypeptide or protein at a low natural level. Mature proteins can 
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary 
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, 
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of 
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, 
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide 
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification 
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it 
may be necessary to modify the protein produced therein, for example by phosphorylation or 
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent 
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions, 
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or 
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice 
sites, leader sequences for enhancing or modifying transport or secretion properties of the 
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion 
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by 
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively 
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide 
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full 
length or mature protein. Polypeptides of the invention also include polypeptides preferably with 
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of 
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or 
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the 
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions. 
The invention also provides biologically active or immunologically active variants of any of the 
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960 
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., with at 
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about 
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99% 
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may 
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID 
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. 

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer. 
Chem. Soc., 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such 
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding 
sequence is identified in the sequence listing by translation of the disclosed nucleotide 
sequences. The mature form of such protein may be obtained by expression of a full-length 
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form 
of the protein is also determinable from the amino acid sequence of the full-length form. Where 
proteins of the present invention are membrane bound, soluble forms of the proteins are also 
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
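
By way of illustration only, the definition of a degenerate variant given above can be sketched in Python. The sketch assumes the standard genetic code and uses "*" for a stop codon, consistent with the Sequence Listing convention; the two short sequences and the function names are hypothetical examples and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an ORF (length a multiple of 3) into a one-letter peptide."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(seq_a, seq_b):
    """True when the nucleotide sequences differ but encode the same peptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: both encode Met-Ala-Leu-Lys followed by a stop.
reference = "ATGGCTCTGAAATAA"
variant = "ATGGCCTTAAAGTGA"

print(translate(reference), translate(variant))
print(is_degenerate_variant(reference, variant))
```

In this sketch the two sequences differ at four codons yet translate to the same peptide, which is the relationship the term "degenerate variant" is intended to capture.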

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 
sequence can be synthesized using commercially available peptide synthesizers. The 
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary 
structural and/or conformational characteristics with proteins may possess biological properties 
in common therewith, including protein activity. This technique is particularly useful in 
producing small peptides and fragments of larger polypeptides. Fragments are useful, for 
example, in generating antibodies against the native polypeptide. Thus, they may be employed 
as biologically active or immunological substitutes for natural, purified proteins in screening of 
therapeutic compounds and in immunological processes for the development of antibodies. 

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or 
which the cell normally produces at a lower level. One skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known 
methods for isolating polypeptides and proteins in order to obtain one of the isolated 
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and 
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory 
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that 
retain biological/immunological activity include fragments comprising greater than about 100 
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not 
limited to, small molecules, molecules from combinatorial libraries, antibodies or other 
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or 
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method, which involves 
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
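
The sequence-design half of the alanine-scanning method described above can be sketched in Python as follows. This is a minimal illustration only: it enumerates single-residue alanine substitutions for a one-letter peptide string and leaves the activity testing of each variant, which is an experimental step, outside its scope. The function name, the variant naming scheme (for example "C3A") and the example peptide are assumptions made for the sketch.

```python
def alanine_scan(protein):
    """Return (name, sequence) pairs, one per single alanine substitution.

    Positions that are already alanine are skipped, since substituting
    alanine for alanine would reproduce the parent sequence.
    """
    variants = []
    for position, residue in enumerate(protein, start=1):
        if residue == "A":
            continue
        name = f"{residue}{position}A"  # e.g. C3A: Cys at position 3 to Ala
        mutated = protein[:position - 1] + "A" + protein[position:]
        variants.append((name, mutated))
    return variants


# Hypothetical peptide used only to show the output format.
for name, sequence in alanine_scan("MKCAL"):
    print(name, sequence)
```

Each variant produced in this way would then be expressed or synthesized and assayed, and a loss of activity would point to the substituted residue as important for function.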

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological 
methodologies may also be easily made by those skilled in the art given the disclosures herein. 
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and 
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) 
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 

The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs 
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match between the 
sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
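
For orientation only, the notion of percent identity over an alignment that gives the largest match can be sketched in Python. The sketch is not the algorithm of GAP, BLAST or FASTA. It is a plain global alignment by dynamic programming in the style of Needleman and Wunsch, and its scoring values (match +1, mismatch -1, gap -2) and the choice of the full alignment length as the denominator are assumptions made for illustration; the named programs use their own scoring matrices, gap penalties and identity conventions.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Hypothetical sequences; the second lacks one base relative to the first.
print(global_align("ACGTACGT", "ACGACGT"))
print(percent_identity("ACGTACGT", "ACGACGT"))
```

Because the reported percentage depends on the scoring scheme and on how gaps are counted, a threshold such as "at least about 90% identity" is meaningful only together with the program and parameters used to compute it.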

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention comprise one or more domains fused to 
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin 
fusion proteins of the invention can be incorporated into pharmaceutical compositions and 
administered to a subject to inhibit an interaction between a ligand and a protein of the invention 
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin 
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the 
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative 
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) 
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as 
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to 
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
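
The requirement that the fused sequences be joined in-frame can be illustrated with a short Python sketch. It assumes the standard genetic code, removes the terminal stop codon of the N-terminal partner, and rejects any segment whose length would shift the reading frame. The sequences shown are short hypothetical placeholders, not GST or any sequence of the Sequence Listing, and the function names are chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(n_terminal_orf, c_terminal_orf, linker=""):
    """Join two ORFs (and an optional linker) so that one frame is read through."""
    segments = {
        "N-terminal ORF": n_terminal_orf.upper(),
        "linker": linker.upper(),
        "C-terminal ORF": c_terminal_orf.upper(),
    }
    for name, sequence in segments.items():
        if len(sequence) % 3 != 0:
            raise ValueError(f"{name} would shift the reading frame")
    n_part = segments["N-terminal ORF"]
    if n_part[-3:] in STOP_CODONS:
        n_part = n_part[:-3]  # drop the stop so translation continues
    fused = n_part + segments["linker"] + segments["C-terminal ORF"]
    if "*" in translate(fused)[:-1]:
        raise ValueError("fusion contains an internal stop codon")
    return fused


# Hypothetical placeholder ORFs: M-S-P-stop and A-L-K-stop.
tag_orf = "ATGTCCCCTTAA"
insert_orf = "GCTCTGAAATAA"
print(translate(fuse_in_frame(tag_orf, insert_orf)))
print(translate(fuse_in_frame(tag_orf, insert_orf, linker="GGATCC")))
```

A linker or cloning junction whose length is not a multiple of three is rejected here because every codon downstream of it would otherwise be misread.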

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention; or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators of the
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 



4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human interleukin
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors, implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stroma cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48:173-182 (1991); Klug et al., J. Clin. Invest., 98(1):216-224 (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering, eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92:7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77:2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal 
biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral 

nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 

involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 

5 composition may be used in the treatment of diseases of the peripheral nervous system, such as 

peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 

system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 

accordance with the present invention include mechanical and traumatic disorders, such as spinal 

10 cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 

resulting from chemotherapy or other medical therapies may also be treatable using a 

composition of the invention. 

Compositions of the invention may also be useful to promote better or fester closure of 

non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 

1 5 insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 

regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 

kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 

endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 

20 desired effects may be by inhibition or modulation of fibrotic scarring may allow normal tissue 

to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 

regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 

conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125:
59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 445-54, 1999), guinea pig skin
sensitization test (Vohr et al., Arch. Toxocol. 73: 501-9), and murine local lymph node assay
(Kimber et al., J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic

compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or
alleviating autoimmune disorders can be determined using a number of well-characterized
animal models of human autoimmune diseases. Examples include murine experimental
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice,
murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine
experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York,
1989, pp. 840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion of (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et 
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of 
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al. Blood 84:111-117, 1994; Fine et al.
Cellular Immunology 155:111-122, 1994; Galy et al. Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of FSH. Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
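
By way of illustration only, the readout of a membrane-migration assay of the kind described above might be summarized as a ratio of cells migrating in the presence of a test protein to cells migrating with medium alone. The following Python sketch assumes replicate cell counts per well; the function name `chemotactic_index`, the ratio-based index and the sample counts are hypothetical and are not taken from the cited protocols.

```python
from statistics import mean


def chemotactic_index(test_counts, control_counts):
    """Ratio of mean migrated-cell count with the test protein to the
    mean count with medium alone (an assumed, illustrative metric)."""
    if not test_counts or not control_counts:
        raise ValueError("both conditions need at least one replicate")
    baseline = mean(control_counts)
    if baseline <= 0:
        raise ValueError("control migration must be positive")
    return mean(test_counts) / baseline


# Hypothetical replicate counts of cells that crossed the membrane.
test_wells = [120, 135, 128]
control_wells = [40, 44, 36]

index = chemotactic_index(test_wells, control_wells)
print(f"chemotactic index: {index:.2f}")
```

Under these assumptions, an index well above 1 would be consistent with induced migration, while an index near 1 would suggest no effect. Distinguishing directed (chemotactic) from random (chemokinetic) movement would require additional controls that this sketch does not model.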

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Cancer Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses. Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods. ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
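
As a hedged illustration of how the diminution in complex formation measured in a competitive binding assay might be quantified, the following Python sketch converts raw binding signals into percent inhibition and estimates a half-maximal inhibitory concentration (IC50) by log-linear interpolation. The function names, the interpolation approach, the units and the sample data are assumptions introduced for illustration; they are not part of the described method.

```python
import math


def percent_inhibition(signal, max_signal, background=0.0):
    """Percent loss of complex formation relative to the uninhibited signal."""
    span = max_signal - background
    if span <= 0:
        raise ValueError("max_signal must exceed background")
    return 100.0 * (1.0 - (signal - background) / span)


def estimate_ic50(concentrations, signals, max_signal, background=0.0):
    """Interpolate, on a log10 concentration scale, the concentration giving
    50% inhibition. Returns None if the data do not bracket 50%."""
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    previous = None
    for conc, sig in sorted(zip(concentrations, signals)):
        inhibition = percent_inhibition(sig, max_signal, background)
        if previous is not None:
            prev_conc, prev_inhibition = previous
            if prev_inhibition < 50.0 <= inhibition:
                fraction = (50.0 - prev_inhibition) / (inhibition - prev_inhibition)
                log_ic50 = math.log10(prev_conc) + fraction * (
                    math.log10(conc) - math.log10(prev_conc)
                )
                return 10 ** log_ic50
        previous = (conc, inhibition)
    return None


# Hypothetical test-compound concentrations (nM) and bound-label signals.
concs_nm = [1, 10, 100, 1000]
bound_signal = [950, 800, 300, 100]

ic50 = estimate_ic50(concs_nm, bound_signal, max_signal=1000)
print(f"estimated IC50: {ic50:.1f} nM")
```

A full analysis would more typically fit a sigmoidal dose-response curve to all points; interpolation between the two bracketing concentrations is used here only to keep the sketch short and dependency-free.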

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation
resulting from over production of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid






WO 01/57190 PCT/US01/04098 

arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
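
The restriction fragment length polymorphism analysis mentioned above can be illustrated
with a minimal sketch. It assumes a single-stranded, in silico digest with EcoRI (recognition
sequence GAATTC, cutting after the first G). The two 30-base amplicons are hypothetical
sequences made up for illustration, not sequences of the invention, and the function name
digest is likewise an assumption.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases into the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_OFFSET = 1       # EcoRI cuts between G and AATTC

# Hypothetical amplicons that differ at one base (A versus G) inside the site.
allele_a = "ATGCCGTAGGAATTCTTAGCCGATAGCTAA"  # site present
allele_b = "ATGCCGTAGGAGTTCTTAGCCGATAGCTAA"  # site destroyed by the polymorphism

fragments_a = digest(allele_a, ECORI_SITE, ECORI_OFFSET)
fragments_b = digest(allele_b, ECORI_SITE, ECORI_OFFSET)
print(fragments_a, fragments_b)
```

In this sketch the allele carrying the site yields two fragments while the variant allele
remains uncut, which is the differential digestion pattern that the analysis relies on. A real
assay would work with double-stranded genomic DNA and gel-resolved fragment sizes.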

Alternatively a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium in CFA an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
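
The dosing and scoring timetable described above can be laid out as a short sketch. It
assumes that "every other day until day 24" means days 0, 2, 4, ..., 24 counted from the
injection of Mycobacterium in CFA. The names build_schedule and percent_score_reduction are
hypothetical helpers, and the percent-reduction comparison is only one possible way to
summarize a decrease in arthritis score, not a method stated in the protocol.

```python
INJECTION_DAY = 0
LAST_DAY = 24

# Assumption: treatment on the injection day, then every other day through day 24.
DOSING_DAYS = list(range(INJECTION_DAY, LAST_DAY + 1, 2))
# Scoring days as listed in the text.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def build_schedule():
    """Return (day, actions) pairs for every day on which something happens."""
    schedule = []
    for day in range(INJECTION_DAY, LAST_DAY + 1):
        actions = []
        if day == INJECTION_DAY:
            actions.append("inject killed M. tuberculosis in CFA")
        if day in DOSING_DAYS:
            actions.append("administer test compound (or PBS control)")
        if day in SCORING_DAYS:
            actions.append("record arthritis score")
        if actions:
            schedule.append((day, actions))
    return schedule


def percent_score_reduction(treated_scores, control_scores):
    """Percent decrease of the mean treated score relative to the mean control score."""
    treated_mean = sum(treated_scores) / len(treated_scores)
    control_mean = sum(control_scores) / len(control_scores)
    if control_mean == 0:
        raise ValueError("control mean score is zero; reduction is undefined")
    return 100.0 * (control_mean - treated_mean) / control_mean


for day, actions in build_schedule():
    print(f"day {day:2d}: " + "; ".join(actions))
```

Under this reading, day 15 is the only scoring day without a treatment, and the treated and
PBS-only groups would be compared on each scoring day.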



4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
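
The per-dose ranges stated above scale with body weight, which the following minimal sketch
works through. The function name dose_bounds_ug and the 70 kg body weight are illustrative
assumptions. The arithmetic only converts the stated ranges into absolute amounts and is not
a dosing recommendation.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kilogram of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)     # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)   # about 0.1 ug/kg to 10 mg/kg


def dose_bounds_ug(weight_kg, range_ug_per_kg):
    """Return the (low, high) amount per dose in micrograms for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg, high * weight_kg)


# Hypothetical 70 kg patient.
weight_kg = 70.0
broad = dose_bounds_ug(weight_kg, BROAD_RANGE_UG_PER_KG)
preferred = dose_bounds_ug(weight_kg, PREFERRED_RANGE_UG_PER_KG)
print(f"broad range:     {broad[0]:.2f} ug to {broad[1] / UG_PER_MG:.0f} mg")
print(f"preferred range: {preferred[0]:.2f} ug to {preferred[1] / UG_PER_MG:.0f} mg")
```

For the assumed 70 kg patient the broad range spans roughly 0.7 µg to 7000 mg per dose and
the preferred range roughly 7 µg to 700 mg, which shows how wide the stated ranges are and
why the text leaves the actual dosage to the prescribing physician.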

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often
in a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
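
As a rough check on the VPD:5W composition described above, the following Python sketch computes the final concentrations after the 1:1 dilution. It assumes that volumes are additive on mixing and that the diluent contributes only dextrose; the function name `dilute` and the dictionary layout are illustrative and are not part of the disclosure.

```python
# Stock VPD composition in % w/v, taken from the text above.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# Diluent: 5% dextrose in water.
D5W = {"dextrose": 5.0}


def dilute(stock, diluent, parts_stock=1.0, parts_diluent=1.0):
    """Return final % w/v of each solute after mixing stock and diluent.

    Assumption (not from the source): volumes are additive on mixing.
    """
    total = parts_stock + parts_diluent
    final = {name: pct * parts_stock / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        final[name] = final.get(name, 0.0) + pct * parts_diluent / total
    return final


vpd_5w = dilute(VPD, D5W)
for name, pct in vpd_5w.items():
    print(f"{name}: {pct:.2f}% w/v")
```

Under these assumptions the 1:1 dilution halves each stock concentration (1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300) and yields 2.5% dextrose.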

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanolamine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition, 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve its 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of the test compound which achieves a half-maximal inhibition of the protein's 
biological activity). Such information can be used to more accurately determine useful doses in 
humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The 
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
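
The therapeutic index defined above is a simple ratio, and a short Python sketch may make the comparison between compounds concrete. The compound names and the LD50 and ED50 values below are hypothetical and chosen only for illustration; the function name `therapeutic_index` is likewise an assumption of this sketch.

```python
def therapeutic_index(ld50, ed50):
    """Return LD50 / ED50; both doses must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values for two candidate compounds (mg/kg), for illustration only.
candidates = {
    "compound A": {"ld50": 200.0, "ed50": 10.0},
    "compound B": {"ld50": 50.0, "ed50": 10.0},
}

indices = {
    name: therapeutic_index(v["ld50"], v["ed50"]) for name, v in candidates.items()
}
# A higher index means a wider margin between effective and lethal doses.
preferred = max(indices, key=indices.get)
print(indices, "preferred:", preferred)
```

With these made-up numbers, compound A has an index of 20 and compound B an index of 5, so compound A would be preferred on this criterion alone.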

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
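
One way to check a regimen against the time-above-MEC ranges given above is sketched below in Python. It assumes plasma concentrations sampled at evenly spaced times over one dosing interval, so that the fraction of samples above the MEC approximates the fraction of time; the sample values, the MEC of 1.0, and the function names are hypothetical.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of evenly spaced samples whose concentration exceeds the MEC."""
    if not concentrations:
        raise ValueError("at least one sample is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def classify(fraction):
    """Map a time-above-MEC fraction onto the ranges named in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within range (10-90%)"
    return "outside 10-90%"


# Hypothetical plasma concentrations (arbitrary units) over one dosing interval.
samples = [0.5, 2.0, 4.0, 3.0, 2.5, 1.5, 0.8, 0.4]
frac = fraction_above_mec(samples, mec=1.0)
print(f"{frac:.1%} of the interval above MEC: {classify(frac)}")
```

Here five of the eight samples exceed the MEC, which places this hypothetical regimen in the 50-90% band.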

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
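
The per-kilogram ranges above translate into total daily amounts by multiplying by body weight. The Python sketch below shows the arithmetic; the 70 kg body weight and the names `daily_dose_range_ug`, `EXEMPLARY_UG_PER_KG` and `PREFERRED_UG_PER_KG` are assumptions made for illustration, and the result is not dosing guidance.

```python
UG_PER_MG = 1000.0

# Per-kg daily ranges from the text, expressed in micrograms per kg.
EXEMPLARY_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)
PREFERRED_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)


def daily_dose_range_ug(weight_kg, per_kg_range):
    """Return (low, high) total daily dose in micrograms for a body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return low * weight_kg, high * weight_kg


# Hypothetical 70 kg subject.
low, high = daily_dose_range_ug(70.0, PREFERRED_UG_PER_KG)
print(f"preferred range for 70 kg: {low:g} ug to {high / UG_PER_MG:g} mg per day")
```

For a hypothetical 70 kg subject, the preferred range works out to roughly 7 µg to 1.75 g per day, and the broader exemplary range to roughly 0.7 µg to 7 g per day.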

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 
fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other 
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 

30 and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a [...] 
epitope on an immunogen, and a correlative method for selecting an antibody that binds [...] 

5.13.4 Fab Fragments and Single Chain Antibodies 

Techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
has provided an alternative mechanism for making bispecific antibody fragments. The fragments 
comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) 
by a linker which is too short to allow pairing between the two domains on the same chain. 
Accordingly, the VH and VL domains of one fragment are forced to pair with the complementary 
VL and VH domains of another fragment, thereby forming two antigen-binding sites. Another 
strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has 
also been reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on a 
leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/00373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and 
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and 
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti- 
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithio) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid 
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as 
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable mediums can 
be used to create a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for 
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g. text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
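
The two storage formats named above (an ASCII text file and a database application) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the FASTA-style layout, the function names, the table schema, and the use of SQLite (from the Python standard library) as a stand-in for database applications such as DB2, Sybase or Oracle. The identifier "EXAMPLE_1" and its residues are placeholders, not sequences from the Sequence Listing.

```python
import sqlite3


def to_fasta(records, width=60):
    """Render {identifier: sequence} as FASTA-style ASCII text (assumed layout)."""
    lines = []
    for seq_id, seq in records.items():
        lines.append(f">{seq_id}")
        # Wrap the residues at a fixed width so the file stays readable.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA-style text back into {identifier: sequence}."""
    chunks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line.upper())
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def store_in_database(records, path=":memory:"):
    """Record the sequences in a SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    example = {"EXAMPLE_1": "ACGTN" * 30}  # placeholder residues
    print(to_fasta(example))
    db = store_in_database(example)
    print(db.execute("SELECT COUNT(*) FROM sequences").fetchone()[0])
```

Either representation preserves the same information; as the text notes, the choice would generally follow from how the stored sequences are to be accessed.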

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at least 95% 
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence 
information for a variety of purposes. Computer software is publicly available which allows a 
skilled artisan to access sequence information provided in a computer readable medium. The 
examples which follow demonstrate how software which implements the BLAST (Altschul et 
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs) 
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be 
useful in producing commercially important proteins such as enzymes used in fermentation 
reactions and in the production of commercially useful metabolites. 
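
As a rough illustration of what identifying ORFs within a nucleic acid sequence involves, the sketch below scans all six reading frames for stretches that run from an ATG start codon to the first in-frame stop codon. It is a deliberately simple stand-in and not the BLAST- or BLAZE-based procedure referred to above; the function names, the ATG-only start rule and the `min_codons` threshold are assumptions.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, orf) tuples for ATG...stop stretches.

    Coordinates are 0-based, end-exclusive, and refer to the strand that
    was scanned ("+" is the input, "-" its reverse complement).
    min_codons counts the start and stop codons as well.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume searching after the stop codon
    return orfs


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # toy sequence, not from the Sequence Listing
    for orf in find_orfs(toy, min_codons=3):
        print(orf)
```

A production search would also have to handle alternative start codons, ambiguous bases and ORFs that run off the end of a partial EST, none of which this sketch attempts.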

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the 
computer-based systems of the present invention comprise a data storage means having stored 
therein a nucleotide sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used herein, "data storage 
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially 
available software for conducting search means is available and can be used in the computer-based 
systems of the present invention. Examples of such software include, but are not limited to, 
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A 
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
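
The remark that longer target sequences are less likely to occur at random can be made concrete with a back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that database residues are independent and uniformly distributed (4 equally likely nucleotides or 20 equally likely amino acids); real sequence databases are not uniform, so the figures are order-of-magnitude guides only. The function name and the example database size are hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance occurrences of one target sequence.

    Under the uniform, independent-residue assumption each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    database_length = 10 ** 9  # hypothetical database size, in nucleotides
    for n in (6, 10, 20, 30):
        print(n, expected_random_matches(n, database_length))
```

Under these assumptions a six-nucleotide target is expected to occur by chance many thousands of times in a database of that size, while a 30-nucleotide target is expected essentially never, which is consistent with the preference stated above for longer target sequences.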

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
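
At the sequence level, a simple "search means" for a nucleic acid target motif can be sketched as a scan for a degenerate consensus pattern. The example below is a simplification: it matches only a linear consensus written with IUPAC ambiguity codes and does not model the three-dimensional configuration that defines a structural motif. The function names and the "TATAWA" promoter-like consensus are illustrative assumptions.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regular-expression classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that finds overlapping hits."""
    body = "".join(IUPAC_REGEX[base] for base in motif.upper())
    # The lookahead lets finditer report matches that overlap each other.
    return re.compile("(?=(" + body + "))")


def find_motif(motif, sequence):
    """Return (position, matched_text) for every occurrence of the motif."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("TATAWA", "GGTATAAAGGTATATACC"))
```

Hairpin structures and protein target motifs such as enzyme active sites would need structure-aware or profile-based methods rather than a plain pattern scan.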

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan 
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
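
For the antisense case, the way sequence information feeds into oligonucleotide design can be sketched as follows: a window of 20 to 40 bases is taken from the mRNA and its reverse complement is returned as a DNA oligonucleotide. This is a minimal sketch under stated assumptions; the function name, the choice of window and the toy mRNA are hypothetical, and practical design would also weigh target accessibility, melting temperature and off-target matches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to mrna[start:start+length].

    The mRNA may be written with U or T; the result is written 5' to 3'.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases, as preferred in the text")
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy_mrna = "AUGGCCAAGCUUGGAUCCGAAUUCAAA"  # toy sequence for illustration
    print(antisense_oligo(toy_mrna, start=0, length=20))
```

A triple helix-forming oligonucleotide follows different pairing rules (it binds in the major groove of duplex DNA), so this reverse-complement rule applies to the antisense case only.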

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard, 
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice 
and Theory of immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, 
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a 
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprise: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents, reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to 
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or infection). 
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of 
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide 
encoded by the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of 
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene 
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such 
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides capable of binding to a specific 
peptide sequence (see, for example, Kaspczak et al., Biochemistry 28:9230-8 (1989)), or 
pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

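Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 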

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the 
corresponding gene is only expressed in a limited number of tissues, a hybridization probe 
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a 
sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or may be a mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of identical sequences or
a degenerate pool of possible sequences for identification of closely related genomic sequences.
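
The difference between a discrete probe and a degenerate pool can be made concrete with a minimal sketch. It assumes the pool is written with the standard IUPAC ambiguity codes; the function name and the example probe are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(probe: str) -> list[str]:
    """Return every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical probe: R (A or G) and N (any base) make it degenerate.
pool = expand_degenerate("ATGRCN")
print(len(pool))  # 2 choices for R times 4 choices for N = 8 sequences
print(pool[:3])   # ['ATGACA', 'ATGACC', 'ATGACG']
```

A probe with no ambiguity codes expands to a single sequence, which corresponds to the discrete case described above.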

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art, are commercially available, and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell.
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
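
The per-well quantities implied by these volumes can be checked with a short calculation. This is only a sketch: it assumes the 7.5 ng/µl DNA concentration still applies to the solution dispensed into the wells, and the variable and function names are illustrative.

```python
def diluted(stock_conc: float, stock_vol: float, total_vol: float) -> float:
    """Concentration after stock_vol of stock is brought to total_vol (C1*V1 = C2*V2)."""
    return stock_conc * stock_vol / total_vol

DNA_NG_PER_UL = 7.5   # DNA concentration given in the text
WELL_UL = 75.0        # ss DNA solution dispensed per well
EDC_STOCK_M = 0.2     # freshly made EDC stock
EDC_UL = 25.0         # EDC volume added per well

# Assumption: the dispensed solution is still at 7.5 ng/ul, so this is an upper bound.
dna_per_well_ng = DNA_NG_PER_UL * WELL_UL
final_volume_ul = WELL_UL + EDC_UL
edc_final_m = diluted(EDC_STOCK_M, EDC_UL, final_volume_ul)

print(dna_per_well_ng)   # 562.5 ng of DNA per well
print(final_volume_ul)   # 100.0 ul reaction volume
print(edc_final_m)       # about 0.05 M EDC during the incubation
```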

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be employed in the chemical
synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
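
A matrix of 256 probes is the size obtained when four positions are fully variable (4^4 = 256). The sketch below enumerates such a set and counts the masked coupling steps a combinatorial scheme would need, assuming one step per base per position. Whether the cited array was laid out exactly this way is an assumption, and the function names are illustrative.

```python
from itertools import product

def probe_set(n: int, bases: str = "ACGT") -> list[str]:
    """All sequences with n fully variable positions."""
    return ["".join(p) for p in product(bases, repeat=n)]

def coupling_steps(n: int, bases: str = "ACGT") -> int:
    """Assumed scheme: one masked coupling step per base at each position."""
    return len(bases) * n

probes = probe_set(4)
print(len(probes))        # 4**4 = 256 distinct probes
print(coupling_steps(4))  # 4 * 4 = 16 coupling steps
```

The point of the combinatorial strategy is visible in the two numbers: the probe count grows exponentially with the number of variable positions while the step count grows only linearly.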

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describe
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
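
The two specificities described above can be compared with a small in silico digest. This sketch treats the input as a linear molecule (pUC19 itself is circular), uses a made-up test sequence, and the function and pattern names are illustrative.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
# Pu = A or G, Py = C or T.
CVIJI_NORMAL = re.compile(r"(?=[AG]GC[CT])")                      # PuGCPy
CVIJI_RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # plus PyGCPy, PuGCPu

def cut_positions(seq: str, pattern: re.Pattern) -> list[int]:
    """Cut index for each site: the blunt cut falls between G and C, 2 bases into the site."""
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]

def fragments(seq: str, pattern: re.Pattern) -> list[str]:
    """Fragments of a linear sequence after cutting at every matching site."""
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

demo = "TTAGCTAACGCCTTGGCAAT"  # hypothetical sequence with one site of each kind
print(fragments(demo, CVIJI_NORMAL))   # ['TTAG', 'CTAACGCCTTGGCAAT']
print(fragments(demo, CVIJI_RELAXED))  # ['TTAG', 'CTAACG', 'CCTTGG', 'CAAT']
```

Under the relaxed specificity only PyGCPu is left uncut, so three of the four possible NGCN site classes become targets, which is why the fragment distribution approaches random.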

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
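
The geometry of the 64-patient example can be sketched as follows. The 9 mm pin pitch is an assumption (the usual 96-well plate spacing), as is the 8 x 8 arrangement of the 64 dots within each subarray; with 1 mm dots this leaves the 1 mm gap between subarrays mentioned above. All names are illustrative.

```python
PIN_PITCH_MM = 9.0   # assumption: standard 96-well spacing of the pin device
DOT_PITCH_MM = 1.0   # from the text: each dot spans about 1 mm
GRID = 8             # assumption: 64 dots per subarray arranged 8 x 8
PIN_ROWS, PIN_COLS = 8, 12

def dot_position(patient: int, pin_row: int, pin_col: int) -> tuple[float, float]:
    """(x, y) in mm of the dot for one patient (0-63) printed by one pin."""
    if not (0 <= patient < GRID * GRID):
        raise ValueError("patient index must be 0-63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin position outside the 8 x 12 device")
    # Each patient's plate is printed with a small offset inside every subarray.
    off_row, off_col = divmod(patient, GRID)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y

print(dot_position(0, 0, 0))     # (0.0, 0.0): first patient, first pin
print(dot_position(63, 7, 11))   # (106.0, 70.0): last patient, last pin
print(GRID * GRID * PIN_ROWS * PIN_COLS)  # 6144 dots in total
```

Under these assumptions the farthest dot lies at 106 mm by 70 mm, which fits within the 8 x 12 cm membrane.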

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing. 
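
The idea of grouping clones by probe signature can be illustrated with a toy sketch. It is not the SBH signature analysis actually used: exact substring matching stands in for hybridization, and the Jaccard similarity measure, the 0.8 threshold, the greedy strategy, and the sequences are all assumptions made for illustration.

```python
def signature(clone: str, probes: list[str]) -> frozenset[str]:
    """Idealized signature: the set of probes that occur in the clone."""
    return frozenset(p for p in probes if p in clone)

def jaccard(a: frozenset, b: frozenset) -> float:
    """Shared probes divided by all probes seen in either signature."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(clones: dict[str, str], probes: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedy clustering: a clone joins the first cluster whose representative is similar enough."""
    clusters: list[tuple[frozenset, list[str]]] = []
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep, members in clusters:
            if jaccard(sig, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]

probes = ["ACGTACG", "TAGCCGA", "GATAGGC", "TTTTCCC", "CCCGGGA"]  # hypothetical 7-mers
clones = {
    "c1": "ACGTACGTTAGCCGATAGGCTA",
    "c2": "ACGTACGTTAGCCGATAGGCTAGG",
    "c3": "GGTTTTCCCAACCCGGGATT",
}
print(cluster(clones, probes))  # [['c1', 'c2'], ['c3']]
```

One member of each resulting cluster would then be the representative clone picked for sequencing.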

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 

Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954 were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
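
The seed-and-extend loop described above can be sketched as follows. This is a minimal illustration, not the algorithm as implemented: the `search` callable is a hypothetical stand-in for a BLASTN query of the databases, and the toy hit table is invented. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from typing import Callable, Iterable, NamedTuple

class Hit(NamedTuple):
    seq_id: str
    score: float      # BLAST score of the hit
    identity: float   # percent identity of the hit

MIN_SCORE = 300       # from the text: score must be greater than 300
MIN_IDENTITY = 95.0   # from the text: identity must be greater than 95%

def assemble(seed_id: str, search: Callable[[set[str]], Iterable[Hit]]) -> set[str]:
    """Extend the assemblage until no new sequence passes both thresholds."""
    members = {seed_id}
    while True:
        accepted = {h.seq_id for h in search(members)
                    if h.score > MIN_SCORE and h.identity > MIN_IDENTITY}
        new = accepted - members
        if not new:          # nothing extends the assemblage: terminate
            return members
        members |= new

# Toy hit table standing in for database searches (hypothetical data).
LINKS = {
    "seed": [Hit("est2", 450, 98.0), Hit("est3", 280, 99.0)],  # est3 fails on score
    "est2": [Hit("est4", 320, 96.5)],
    "est4": [Hit("est5", 500, 94.0)],                          # est5 fails on identity
}

def toy_search(members: set[str]) -> list[Hit]:
    return [hit for m in members for hit in LINKS.get(m, [])]

print(sorted(assemble("seed", toy_search)))  # ['est2', 'est4', 'seed']
```

The loop is written iteratively, but it computes the same closure as a recursive extension: each round searches with the current assemblage and stops once a round adds nothing.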

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO:2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
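
The proprietary Method C program is not available, but the approach it describes can be sketched generically: translate all six frames and keep the longest open reading frame. The sketch assumes the standard genetic code and defines an open reading frame as running from a methionine to the next stop codon or the end of the sequence; both choices and all names are assumptions.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))

def six_frames(seq: str) -> list[str]:
    """Three forward frames followed by three reverse-complement frames."""
    seq = seq.upper()
    reverse = seq.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:]) for strand in (seq, reverse) for offset in range(3)]

def longest_orf(seq: str) -> str:
    """Longest Met-initiated stretch without a stop codon across all six frames."""
    candidates = []
    for frame in six_frames(seq):
        for segment in frame.split("*"):
            start = segment.find("M")
            if start != -1:
                candidates.append(segment[start:])
    return max(candidates, key=len, default="")

print(six_frames("ATGGCCAAATAG")[0])  # MAK*
print(longest_orf("ATGGCCAAATAG"))    # MAK
print(longest_orf("CTATTTGGCCAT"))    # MAK, found on the complementary strand
```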

5.3 EXAMPLE 3
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.

Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.
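
How the two summary values relate to per-residue scores can be shown with a small sketch. It assumes the mean S score is the average of the per-residue S scores over the predicted signal peptide (from position 1 up to the residue before the cleavage site) and that the maximum S score is the highest per-residue value; the scores below are invented and the function is not part of SignalP.

```python
def s_score_summary(s_scores: list[float], cleavage_site: int) -> tuple[float, float]:
    """Return (maximum S, mean S over the signal peptide).

    cleavage_site is the 1-based position of the first residue of the mature
    protein, so the signal peptide is residues 1 .. cleavage_site - 1.
    """
    if not 2 <= cleavage_site <= len(s_scores) + 1:
        raise ValueError("cleavage site outside the scored region")
    signal_peptide = s_scores[:cleavage_site - 1]
    return max(s_scores), sum(signal_peptide) / len(signal_peptide)

# Hypothetical per-residue S scores for a 7-residue stretch, cleavage before residue 6.
max_s, mean_s = s_score_summary([0.90, 0.95, 0.98, 0.90, 0.70, 0.20, 0.10], cleavage_site=6)
print(max_s)             # 0.98
print(round(mean_s, 3))  # 0.886
```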

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acids are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766.

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930.

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences for which the nucleic
acid sequence encodes are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974. 
The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS:975-984. The
corresponding amino acid sequences are SEQ ID NO:1959-1968.
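
The pairing stated above (nucleotide SEQ ID NOS: 975-984 with amino acid SEQ ID NO: 1959-1968) is consistent with a constant offset of 984 between a nucleotide sequence and its polypeptide. The sketch below encodes that reading; the assumption that the pairing is order-preserving, the function name `polypeptide_seq_id`, and the restriction to the first block of nucleotide sequences (SEQ ID NO: 1-984) are illustrative choices, not statements from the Sequence Listing.

```python
# Offset inferred from the pairs stated in the text (975 -> 1959, 984 -> 1968).
# Treating it as constant across SEQ ID NO: 1-984 is an assumption.
SEQ_ID_OFFSET = 984


def polypeptide_seq_id(nucleotide_seq_id: int) -> int:
    """Return the polypeptide SEQ ID NO assumed to pair with a nucleotide SEQ ID NO.

    Only the first nucleotide block (SEQ ID NO: 1-984) is handled; other
    blocks of sequences are outside the scope of this sketch.
    """
    if not 1 <= nucleotide_seq_id <= 984:
        raise ValueError("sketch covers SEQ ID NO: 1-984 only")
    return nucleotide_seq_id + SEQ_ID_OFFSET


if __name__ == "__main__":
    for nt in (975, 980, 984):
        print(nt, "->", polypeptide_seq_id(nt))
```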

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 
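
The SEQ ID NOS column of Table 1 uses a compact notation in which single numbers and hyphenated ranges are separated by spaces, and a range may break across a line after its hyphen. A minimal sketch of how such a cell could be expanded into individual SEQ ID numbers is given here; the function name `expand_seq_ids` and the handling of line-broken ranges are assumptions for illustration.

```python
import re


def expand_seq_ids(cell: str) -> list:
    """Expand a cell such as '738-739 764' into [738, 739, 764]."""
    # Rejoin ranges that were split across a line break, e.g. '189-\n190'.
    cell = re.sub(r"-\s+", "-", cell)
    ids = []
    for token in cell.split():
        if "-" in token:
            start, end = token.split("-", 1)
            first, last = int(start), int(end)
            if first > last:
                raise ValueError(f"descending range: {token}")
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(token))
    return ids


if __name__ == "__main__":
    # Cell contents copied from two short rows of Table 1.
    print(expand_seq_ids("738-739 764"))
    print(expand_seq_ids("14 82 121-122 168 691"))
```

Tokens that are not plain integers or ranges raise `ValueError`, which is a convenient way to flag cells that were damaged during text extraction.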

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also

examined for domains with homology to certain peptide domains. Table „ shows the name of

WO 01/57190 PCT/US01/04098 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.



TABLE 1 



Tissue Origin 


RNA Source


Library Name


SEQ ID NOS: 


lung 






3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208-
209 216-217 221 223 230 234-235 240
244 249 251 253 255 258-259 263 269-
270 277 282 285-286 290 294-295 297
301-302 304-305 307-308 311-312 314
320 329 333 335-336 342 344 346 349
354 358 365 370 373-374 377 380 382-
383 388 394-396 399 401-402 406 409-
410 413 416 420-421 425 428 430-431
436-437 442 456 462 464 466-467 474
484 486 495-496 500-501 506 508-509
519 530 537 542 549 561-562 564 572
574 577-578 580-583 586-587 589 592-
593 596-597 601 608 610 612-614 617-
624 630-632 635 637 650 658 663-664
668 676 679 681 689-690 693 699 724
726 732 736 742-743 747 767-770 780
784 789 793 799 802-805 813 817-818
822 824 829-831 837 839 845 848 856
859-860 864 871-872 875-876 881 887
896-897 901 903 907 910-911 925 930
933 943-944 947 952-953 958 962-963
965 967 972 977


Clontech


JP 0b 1" 115 126 13TWI72 179T85"
204 263 273 305 312 323 358 380 383
395-396 403 420 428-429 431 461 542
583 586 606-607 611 620 645-646 688
690 715 732 736 740 748 754 768 784-
786 790 796 800 878 897 906-907 947


brain


Clontech


ABR006


fT^F 3 ^ 0 12 91 103 118 125T30T-
131 134 184 224 275 338 350 354 Sei-
ses 374 384 390 394 396 43 M32 434
435 445 468 549 621 732 734-736 745
760-761 764 768-769 775 787 806 811
818 887 903 906 918 930 942 947 957
973 977


Clontech


ABR008


2-3 y-n 14 17 21 23-25 28-29 31-35 37
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115 
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328
334 336 338 340-342 344 346 348 350- 

352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402-
403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724 
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872 
876 880 883 885 887 893-898 902 906- 
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 
962-965 967 969-970 972 977 


adult brain 


Clontech 


ABR011 


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR012 


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880 


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348
371 374 388 391 394 399 401 409 411 
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured
preadipocytes


Stratagene


ADP001 


4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466-
467 470 474 478 496 507-509 517 530
532-533 584 588 593 602-603 608 610
617-621 630-631 633 639 642-643 661
693 729 746 761 766 769 834 842 848
887 907 923 947-950 957 967 969


adrenal gland


Clontech


ADR002


1 3 12-13 21 23-24 27-29 67 74 78~T03:
105 108-109 113 115 118 120-121 128-
133 149 156 160 172 177 182 214 217
223 232-233 247 254 269-270 273-274
277 283 285 288 298-299 308 317 319
328 338 340 342 361-362 364 372 376-
377 382 384 401-402 405-406 416 420
431 437 444 446 448 457 462 484 500
507 517 524 532-533 539 545 554 561-
562 564 588 597 602-603 606-607 635
642 646 649 658 664 674 693 703 730
740 745 752 759 765 767 775 779 799
809 817-818 839 845 856 859 863 887
890-891 896 948 953 958 961-963 973


adult heart


GIBCO


AHR001


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38
41 48 54-57 65 69-72 75 78 80 82-83 97
99-100 108 112-115 117-121 123-124
128-133 141 144-146 149 152 159 162-
163 168 172 176 179 181 184 186-187
190-191 201 203 208-209 212 216-218
221 223 227 229 233 244 247 249 253-
255 258 263-264 267 269-270 274 278
280-282 285 289 291 295 297-299 301
303-304 308 313 317 321-322 326 328
334 344 348 352 358 361-363 370-371
380 382-383 388 394-396 398 401 403
405-406 410-416 423 425-427 430-431
436 452-453 464-465 470-474 481-484
487-488 490 492-494 496 499-500 505
506 508-509 514 523 529-530 533 547-
548 553 558 563-565 577-578 586-588
590 593 597 601-603 606-608 610-613
617-619 621-622 626-628 637-638 642-
644 652 658 661 672 682-683 688 691
693 697 699 708 711 713 715 732 737
745 747-748 750-753 759 761 765 768-
770 775 790 802-803 814-815 818-819
830 837 839-840 842 845 848 859 861-
862 867 876-877 887 891-892 896 900-
901 903 905-906 908-909 919-920 922
925 928 936 939-940 946-947 950 953
959 967 970-971 973 977


adult kidney


GIBCO


AKD001


l j * 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72
75 77-78 82 84 89-90 93 97 108-110 114-
116 118-121 123-125 128 130-133 135
138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363 
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO 


ALG001 


1 3 14 18 28-29 38 54-56 59 92 110 114-
115 130-131 146 149 156 159 164 167
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426-
427 431 465 469 474 484 498 500 506
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963
967


lymph node


Clontech


ALN001


1 3 10 110 146 160 168 196 209 221 269
278 301 336 348 394 405 411 420 422
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO 


ALV001


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139
149 152 169-172 176 184 189-190 200 

209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755 
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 


adult liver 
adult liver 


Invitrogen 
Clontech 


ALV002 
ALV003 


3 37 42 56 60 )[ 82 104-105 114-115 

117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582
->y*-->yj 604-605 608 610 621 630-
631 634-635 637 657 664 690 693 699
723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 qjm 


adult ovary 


Invitrogen 


AOV001 

60 134 169-171 275

1 5 9-10 12-14 16 18 20 22-25 28-29 33-
35 37 39 41-42 46 48-50 55-57 59 63-67
69 71-72 75 77-80 82 88-89 92 101 103-
106 108-110 113 115 119-121 123-126
128-133 135 138 142-146 149 151-152
159-161 167-168 172 174 176-177 179
181 184-190 194 198 200 203 208-209
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255
258-259 264 269-271 274 276-277 279-
283 285 288 290 293-294 297 301-304
306-308 311 314 319-322 325-326 328-
329 331-332 335-338 341-342 344 348
354-358 361-363 inn
^ Jwi'juj dod joo J /U-372 374
376 379-380 382-383 388 394-396 398-
399 401-402 405-406 409-412 416 418-
421 423 425-433 438 442-443 449-452
454 462 464 466-467 469-471 474 479
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 


adult placenta 


Invitrogen 


APL001 


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001 


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO


ATS001 


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269 
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784 
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 




Genomic DNA 
from BAC 
63118 


Research 
Genetics 
(CITB BAC 
Library) 


BAC001 


515




Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC002 


640




Genomic DNA 
from BAC 
39316 


Research 
Genetics 
(CITB BAC 
Library) 


BAC003 


640




adult bladder 


Invitrogen 


BLD001 


50 55 66 71 111 143-144 148 160 201 209
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 


bone marrow


Clontech 


BMD001 

J 10-13 16 18 20-21 25 28-29 31-34 4l"4T
48 52 54-55 57 59 61 65 67 72-73 75 78
80 82 84 99 103 108 110 114-115 118-
120 123-124 128 130-133 143-144 148
152 159-161 163 168 172 174 176 178
190 192 198 203 209 211 217-218 221
223-224 227 233-236 244 247 249 252
254 258 260-262 267 269 272 278 280-
281 284-285 288 290 294-297 301 304
308 314 317-318 320-321 325 328-330
333-335 349 351-354 358 363 365 367
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442
449-450 453 455 459 464 468-470 474
478-479 481 484 490 496 504 506 508-
509 511 519-521 530 532 539 553 558-
559 561-563 580 582 586 592 599 608
610 613-614 617-619 623 625-628 635
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761
768-771 774 776-778 784 787 789 813
817-818 822 834 839-840 842 848 862
866 870 876 885-887 891 896-898 900
903 906 913 919 921-922 927-928 939
944 947 950 953 959 961-963 967-968
970 973 977


bone marrow


Clontech


BMD002


3 9-10 15-19 30 33-34 39 45 54 57 63-64
71 82 102 116 119 130-133 148 152 156
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488- 
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 769-771 775-778 784 787 811 
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001 


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL016 


358 740 760 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001 


1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117- 
118 121 129-131 135 141 143-146 160- 
161 164 168 172 177 189-190 193 195 
200 204 209 211-212 217 226 229-230 
232 234-235 240-242 246 254 260-263 
268-270 274 277 282 285 292 295 297 
305-308 314-316 319 328 343-344 348 
354 358 363 368 380 382-384 389 394 
396 399 401 405-407 410 416 418-421 
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586 
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 



* The 16 tissue-mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).


779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871
874 877 887 891-894 897-898 901 913
916 919 921-922 925 946-947 953 958-
959 967 969 973


diaphragm


BioChain


DIA002


3 39 184 203 431 563 848 967


endothelial
cells


Stratagene


EDT001


3 6 8-10 14 19-24 28-29 33-34 37 39 41
46 48 52 55-58 62-65 67 69 71-72 75 78
80 82-83 87 101-102 108-109 114-115
117 123-124 128 130-133 135 138 143
145-146 149 156 159-160 167-168 172
174 176-177 179 181 184-187 189-190
194-195 200 203 208-209 212 216-217
219 223-224 226-227 229 234-235 244
248-249 254-256 258 263-264 267 269
271 274 276-282 285 290-291 294 297
301-304 308 311 313-314 316-317 320-
321 323 325-326 328-329 331-332 334-
337 339-341 344 348-349 352 354-355
358 361-363 365 367 371-372 375 379-
380 383 389 394-395 398-403 405-406
409-412 425-428 437 442-443 448 454
464 466-467 474 479 481 490 492-498
500 503 506-509 511 517 520-521 523-
524 530 532 537 540-542 558 561-563
565 569-570 573 581-583 586 588-589
596 602-608 610-611 613 617-622 625
628 630-631 633-637 642-643 646 648
650 652 659 661-662 682 688 690-693
696 698-699 708 712 715 717 720-722
724 727 729 740 745 748-750 752 761
765 767-770 772-773 779 784 789 792-
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848
859 861-862 864 866-867 870 876 885
887 891 893-894 897-898 900 903 906-
907 913 916 921 925 939 947 950 953
955 957-958 962-963 967 973 978 984


Genomic
clones from the
short arm of
chromosome 8


Genomic
DNA from
Genetic
Research


EPM001


324 515 640


esophagus


BioChain


ESO002


97 103 128 371 474


fetal brain


Clontech


FBR001


67 129 156 159 232 267 433 446 503 845
952


fetal brain


Clontech


FBR004


28-29 185 213 277 350 384 432 485 501
549 651 747 754 761 780 787 848 870
887 906 958


fetal brain


Clontech


FBR006


10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588-
589 602-608 611 615 617-619 621-622
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 

77^ 7R0-7R1 700 £01 8fi8 818 897 893 ■ 
83S 843 84S 8^6 8^0 864 867 876 8R0 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRs03 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 

K\A £A1 AAl AAR A-^fl £70 £CQ (\Q1 
£QQ 710 71 5 1AO l&X lA^ 74R-74.Q 7^ 

768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001 


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001 


3 31 33-34 38 48 54 72 160 208-209 211 
223 264 269 277 283 290 313 325 341 

1/Lft ISk 3Q6 41 8-470 474 4R4 ^06 ^flR- 
S00 517 570-571 537 547 553 558 567 
569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney


Invitrogen


r jsjl/uu / 


"I 1 1 R 1 R6-1 R7 73ft 744 771 439 RR7 060 


fetal lung 


Clontech 


FLG001 


69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186- 
187 200 204 212 226 229 246 274 309 
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604-
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896
902 904 914-915 958


fetal lung


Clontech


FLG004


130-131 394 664 769 942


fetal liver-
spleen


Columbia
University


FLS001


3 8-10 12-13 16-17 19-25 27-29 33^BT7
38 41 45-46 48 52 55-58 60-67 69 71-74
77-78 80 82 84 87-90 104-106 108-109
112-121 123-125 128-134 138 141 143-
146 149 151 156 159 163-164 167-172
174 176-179 181 184 186-188 190 194
200-201 203 208-209 211-212 216-217
219 224-227 229-230 232 234-235 237
241 243-244 246-248 254-255 258 260-
263 267 269-270 273-282 284-285 288-
290 292-295 297-299 301-306 308 311-
318 320-323 326 328 332 335 341-344
348 352 354-359 361-365 367-368 371-
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421
425 428-430 432-433 437 439 442-444
449-450 452 456-457 461-470 472-474
478-479 481-482 484-485 487 490-494
497-499 504-507 511 514-515 517-521
523-524 526 529 532 537 540-541 547
555 558-559 563 575 577-578 580-596
598-599 601-603 606-608 610-613 617-
624 626-628 630-631 634-636 639 642-
643 647-648 654-656 663-665 672 674-
675 679 681 684 686 688 691 693-699
711 713 715 717 719-726 729 732-733
738-740 745 748-749 751-753 757 759
761 767-770 776-778 780 784 787 792-
794 799 804 809 811 813 817-819 822-
825 830-831 834 837 840 842 845-848
852 856 859 861-862 865 867-869 871
874-878 887-888 891 893-894 896-900
903 905-911 913 916 918 923 928 930-
931 936 939 942 944 946-950 952 958-
959 961-963 965 967 969-970 972-973
976-977 981-983


fetal liver-
spleen


Columbia
University


FLS002


3 8-13 15-17 19-20 22 25 28-29 33-35 37
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342-
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418- 
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715 
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864-
865 867 874-878 888 891-892 896-900
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 


Invitrogen 


FLV001 


37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317 
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453 
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720 
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870 
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001 


15 27 32 37 67 72 83 99 112 121 138 167
174 177 186-187 190 203-204 211 215
230 252 259 312 374 403 406 409 457
461 485 505 517 528 530 540-541 544
549 554 558 579-580 583 602-603 608
639 642-643 654 664 699 715 730 737
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967


fetal muscle


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962-
963 973 


fetal skin 


Invitrogen 


FSK001 


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115 
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 95a 962-964 967 975 


fetal t i\c\r\ 


Invitrogen




5 130-131 146 194 306 354 367 400 405
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001 


276 563 842 


umbilical cord 


BioChain 


FUC001 


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 

769 774-775 703 707 807 818 899 837

848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969


fetal brain


GIBCO 


HFB001 


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726 
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921- 
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001 


86 168 186-187 297 537 608 681 761 845 
877 


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612-
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 


infant brain 


Columbia 
University 


IB2003 


3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223- 
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 
524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-736 746 751 754 769 785-786 793 
800 807 811-813 818-819 822 824 831 
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 
973 982 


infant brain 


Columbia 
University 


IBM002 


16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 


infant brain 


Columbia 
University 


IBS001 


84 86 180 185 198 201 203 230 279 312 
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 


lung, fibroblast 


Stratagene


LFB001 


3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


lung tumor 


Invitrogen 


LGT002 


1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292
294 297 301 308-309 311 314 317 321 
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001 


3 9-11 32 47 50 56 71 75 88 97 99 102
121 125 128-129 135 138 141 149 163
167-168 212-213 217 233 255 290 294
301 305 311 314 342 372 377 388 398-
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 


LUC001 


1 3 9 11 18-19 21 23-25 27 31-34 39 41-
42 46-48 52 54-58 62-69 71-72 74-75 78-
80 82 89-90 93 99 110 115-121 123-124
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240 
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602-
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817-
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532 
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789 
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC #CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767- 
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 


Invitrogen 


MMG001


1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 

^RO ^SIA ^87 ^$20 ^03 ^07 £fl1 £1A 

612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650-657 663-
664 674 676 679 688-689 691 693 696
701-703 713 715 717 728 730 732 738-
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001 


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 748 752 761 793 
818 848 851 897 


retinoic acid
induced neuron
cells


Stratagene


NTR001 


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene



NTU001 


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
658 732 740 765 769 784 791 793 799 
802-803 818 842 851 864 897 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech 


PRT001 


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203 
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 564 583 602-603 
611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001 


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland 


Clontech 


SAL001 


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981


salivary gland


Clontech 


SALs03 


217 254 270 388 610 


skin fibroblast 


ATCC 


SFB001 


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001 


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 

911 913 94s 953 959 976 984




Clontech


oisJViuu 1 


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001 


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172
176 188 190 205 209 229 232 258 285
301 308 312-314 321 323 329 346 374
377 380 383 388 394 398 406 409-410
431 449-450 453 455 466-467 470-471
484-486 488 495 497 500 503 508-509
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach


Clontech


CTAAA1 


35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus 


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001 


10 16 20 28-29 32 37 41 52 57 66-67 74- 

110 1 1 R 171 190-1 31 141 1^1 1^0 

208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389
395 398 411-412 420 423 435 452 500
508-509 517 524 532 537 551 558 560
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772-
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619-
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland 


Clontech 


THR001 


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236 
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 
848-849 862 864 868-869 871 874 876- 
877 887 893-894 896-897 907-909 912 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 


trachea 


Clontech 


TRC001 


33-34 55-56 69 74 163 172 190 209 212
267 270 291 305 314 352 413 426-427
466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 


uterus 


Clontech 


UTR001 


4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 440 
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ 

ID

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


1 


L06175 




occurs in MHC class I region; ORF


308 


98 


2 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta. 


3094 


98 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to 
782) 


4112 


100 


4 


AF110640


Homo sapiens 


orphan seven-transmembrane 
receptor 


344 


100 


5 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 


158 


72 


6 


W85607 


Homo sapiens 


Secreted protein clone da228 6. 


1477 


100 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 
hDRR4. 


884 


oo 


8 


Y15227 


Homo sapiens 


Leu1


391 


100 


9 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


10 


X92106 


Homo sapiens 


bleomycin hydrolase 


2445 


100 


11 


Y15228 


Homo sapiens 


Leu2 


445 


100 


12 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


432 


34 


13 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653


Homo sapiens 


pancreatic elastase IIB zymogen 


1435 


99 


17 


Y13398 


Homo sapiens 


Amino acid sequence of protein 
PRO346.


1749 


99 


18 


Y02283 


Homo sapiens 


Secreted protein clone br342_J 1 
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_l 
protein sequence SEQ ID NO:66. 


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to 
fucosidase, alpha-L-1, tissue (EC 
3.2.1.51, alpha-L-fucosidase
fucohydrolase)) 


2597 


99 


21 


B01384 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-10. 


2470 


100 






WO 01/57190 



PCT/US01/04098 





23 


Y55935 


Homo sapiens 


Human KHS2 protein.


4781


99


24


Y55935


Homo sapiens


Human KHS2 protein.


2807


100 


25 


AC024792 


Caenorhabditis 
elegans 


contains similarity to 1 K:O9502y


4o3 


31 




xv/yfz. 


/of 


Human secreted protein fragment 


I 54U 


100 


id / 


/OJU 


Homo sapiens 


serine/threonine protein kinase 


1HQ 1 


AO 

98 




AtIjU/jj 


Mus musculus 


microtubule-actin crosslinking factor 




CO 

00 




Ar 1 5U /55 


Mus musculus 


microtubule-actin crosslinking factor 


OTT< 


TA i 

70 


irk 




Mus musculus 


UJVLK-iNy 


OQOO 

2Voo 


oc 

86 


31 


AJ000522 


Homo sapiens 


axonemal dynein heavy chain


6058 


99 


32 


AF037256


Mus musculus 


ES2 protein 


2260 


91 


33 


S62140


Homo sapiens 


TLS=nuclear RNA-binding protein


2917 


100 


34 


S62140 


Homo sapiens 


TLS=nuclear RNA-binding protein


2890 


98 


36 


AB038237 


Homo sapiens 


G protein-coupled receptor C5L2 


1767 


100 


37 


D79994 


Homo sapiens 


similar to ankyrin of Chromatium 
vinosum. 


6089 


99 


38 


X63380 


Homo sapiens 


serum response factor-related protein 


1966 


99 


39 


AL022072 


Schizosaccharomyces pombe


lipoic acid synthetase 


1067 


61 


40 


J03930 


Homo sapiens 


alkaline phosphatase 


2751 


100 


41 


AF132968 


Homo sapiens 


CGI-34 protein 


1088 


98 


42 


AL117637


Homo sapiens 


hypothetical protein 


2208 


100 


43 


AL021393 


Homo sapiens 


bK747E2.1 (novel protein) 


1526 


100 


44 


X68011 


Homo sapiens 


ZNF81 


1886 


100 


45 


AC002464 


Homo sapiens 


organic cation transporter; 50% 
similarity to JC4884 (PID:g2143892)


2423 


100 


46 


W78245 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 19. 


1949 


100 


47 


Y41765 


Homo sapiens 


Human PRO1083 protein sequence.


3604 


100 


48 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; CLIC4


1305 


99 


50 


U09413 


Homo sapiens 


zinc finger protein ZNF135


1361 


57 


51 


AF061812 


Homo sapiens 


keratin 16 


2374 


100


52 


W63681 


Homo sapiens 


Human secreted protein 1. 


1326 


99 


53 


AB035303 


Homo sapiens 


cadherin-10 


4094 


100 


54 


A12022


synthetic 
construct 


MRP-8 


485 


100 


55 


AL121897


Homo sapiens 


bA392M18.3 (KIAA0180)


1867 


100 


56 


Y73330 


Homo sapiens 


HTRM clone 397663 protein 
sequence. 


818 


96 


57


AF151018


Homo sapiens 


HSPC184 


955 


100 


58


AF125042


Homo sapiens 


bisphosphate 3 '-nucleotidase 


1586 


100 


59


AF118670


Homo sapiens 


orphan G protein-coupled receptor 


1971 


100 


60


X04494


Homo sapiens 


precursor polypeptide 


1903 


100 


61 


AF208865 


Homo sapiens 


EDRF 


528 


100 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63 


AF260665 


Homo sapiens 


histone acetyltransferase 


1510 


100 


64 


AF260665 


Homo sapiens 


histone acetyltransferase 


1429 


96 


65 


AJ277145 


Homo sapiens 


ras-related small GTPase RAB18


1073 


100 


66 


Y94950 


Homo sapiens 


Human secreted protein clone 
dhl073 12 protein sequence SEQ ID 
NO: 106. 


348 


100 


67 


Y82744 


Homo sapiens 


DNA replication and repair 
associated protein (DRASP). 


1028 


100 


68 


Y44486 


Homo sapiens 


Human GPRW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein BING4 
(similar to S. cerevisiae YER082C, 
M. sexta MNG10 and C. elegans 
F28D1.1) 


3196 


100 





70 


AJ276316 


Homo sapiens 


zinc finger protein 304.


1 7^1 
1/31 


52 


71


Y18314 


Homo sapiens 


paraplegin-like protein


*r l*fO 


99


72 


AF157028 


Homo sapiens 


protein phosphatase methylesterase-1 


2017 


100 


74 


Y71082 


Homo sapiens 


Human B-aggressive lymphoma
(BAL) protein.


1 /OJ 


99 


75 


AF225420 


Homo sapiens


AD025 


15** 


100 


76 


X95235 


Homo sapiens 


trails en nttnn fartor AP9 




100 


77 


AF108420 


Takifugu 
rubripes 


1-aminocyclopropane-carboxilate 
synthase


733 


56 


78 


G01349


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5430.


650 


99 


79 


AL117635


Homo sapiens 


hypothetical protein 


922 


99 


81 


Z85986 


Homo sapiens


tu i uoivi i .o ^similar to yeast 

simnre^sor nrotpin ^T? P4fl^ 


865


77 


82 


AF183414 


Homo sapiens 


hemin-sensitive initiation factor 2a
kinase




99 


83 


G01143 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5224. 




98


84 


U03985 


Homo sapiens 


N-ethylmaleimide-sensitive factor


J /HH 


99


85 


Y17791 


Homo sapiens 


VAX2 protein 


IH^O 


100


87 


AF263538 


Homo sapiens 


growth differentiation factor 3


ly^w- 


99


88 


Y19757 


Homo sapiens 


SEO ID NO 47 5 from WOQQ77743 




100


89 


AF161493 


Homo sapiens 


HSPC144 


1185 


100 


90 


AF161493 


Homo sapiens 


HSPC144 


OJO 


100 


91 


B25780 


787 


Human secreted protein SEQ ID 


647 


41


92 


U57344 


Mus musculus


JLY1C120 


1007 


89 


93 


AF172854 


Homo sapiens


cardiotrophin-like cytokine CLC


1197 


98 


94 


AL390114 


Leishmania 
major 


extremely cysteine/valine rich 

ULClil 


223 


29 


95 


AB016886 


Arabidopsis
thaliana 


contains similarity to adenylate
kinase~gene_id:MCA23.18


287 


38 


96 


AC005525 


Homo sapiens




1855 


96 


97 


B20997 




Human nucleic acid-binding protein,
NuABP-1. 


3836 


99 


98 


AJ006692 


Homo sapiens 


ultra high sulfer keratin 


507 


70 


99 


AF172264


Homo sapiens


Traf2 and NCK interacting kinase,
splice variant 1


6942 


99 


100 


L11239


Homo sapiens 


homeobox protein 


717 


100 


101


AC004890 


Homo sapiens


similar to zinc finger proteins;
similar to AAC01956

\L 1XJ .)£ y £O i +3 III) 


2154 


98 


102 


AC003682 


Homo sapiens 


R28830 2 


1287 


48 


103 


AF201839 


Rattus 
norvegicus


uynamiii ulldd lsoiorm 


4270 


95 


104 


Y79510 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-6


1394 


100 


105 


Y79510 


Homo sapiens 


Human carbohydrate-associated
protein CRBAP-6


1209 


90 


106 


AL096748 


Homo sapiens 


hypothetical protein 


1216 


100 


108 


X97260 


Homo sapiens


l^/fetallnthirvn^in 7 


351 


100 


109 


AL034422 


Homo sapiens 


dJ1141E15.2 (novel protein)


433 


100 


110 


AF191338 




anaphase -promoting complex subunit 


683 


100 


111 


AL021712 


Arabidopsis 
thaliana 


putative protein 


185 




112 


AF250138 


Homo sapiens 


small stress protein-like protein 
HSP22 


1063 


100 


113 


AL109976 


Homo sapiens 


dJ794I6.1.1 (novel protein) 


4176 


99 


114 


Y36151 


787 | Human secreted protein 


668 


100 





115


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family 
member GLUT9


2052 


99 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein 
sequence. 


931 


100 


118




Homo sapiens 


catalase 


2846 


100 


119 


AF147717


Homo sapiens 


ubiquitin C-terminal hydrolase 
UCH37 


1695 


100 


120 


X73882 


Homo sapiens 


microtubule associated protein 


3801 


99


121 


AC004882 


Homo sapiens 


similar to CAA16821
(PID:g3255952) 


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III 


421 


100 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


557 


94 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


222 


53 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA 
reductase 


1565 


99 


126 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 


1832 


99 


128 


Y10319 


Homo sapiens 


carnitine carrier 


1592 


100 


129 


U75467 


Drosophila 
melanogaster 


Atu 


937 


36


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494 


87


131 


Z21507 


Homo sapiens 


human elongation factor-1-delta


938 


100 


132 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


4818 


95 


134 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


135 


U72970 


Sus scrofa 


calcium/calmodulin-dependent
protein kinase II isoform gamma-B 


2723 


99 


136 


G03213 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7294. 


450 


100 


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 
member 24 


627 


99 


138


AF155648


Homo sapiens 


putative zinc finger protein 


5855 


92 


139 


AF144638


Homo sapiens 


sphingosine-1-phosphate lyase


2977 


100


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


4778 


100 


141 


B08517 


Homo sapiens 


Amino acid sequence of a beta- 
tubulin antigen. 


5841 


100 


142


A56667


Homo sapiens 


calretinin 


1410 


99


143


A9z/o3 


Homo sapiens 


tafazzins 


1605 


100 


144 


Y95293


Homo sapiens 



Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


145


AJr220U40 


Homo sapiens 


uKU03 


1198 


100 


146


M22o// 


Homo sapiens 


cytochrome c 


554 


98 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100 


148


AB026491 


Homo sapiens 


PICK1 


2114 


98 


149 


AB018580


Homo sapiens 


hluPGFS 


1699 


100 


150 


X91868 


Homo sapiens 


six1


1509 


100 


151


ArZOODKJD 


Mus musculus 


pseudouridine synthase 3 


2135 


84 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 






WO 01/57190 



SEQ 
ID 
NO: 



155 



156 



157 



158 



159 



160 



161 



162 



163 



164 



165 



ACCESSION 
NUMBER 



AF141315 



AF110645



AF159297 



AL133325 



AF073298 



AC004858 



SPECIES 



PCT/US01/04098



Homo sapiens 



Homo sapiens 



Zea mays 



Homo sapiens 



Homo sapiens 



Homo sapiens 



AB012109



AL162751 



AJ005698 



AF117646



AC004002 



Homo sapiens 



Arabidopsis 
thaliana 



Homo sapiens 



Homo sapiens 



Homo sapiens 



alpha-1,4-N-acetylglucosaminyltransferase



SMITH- 
WATERMAN 
SCORE 



candidate tumor suppressor p33 
ING1 homolog 



extensin-like protein 



dJ984P4.3 (Homeobox protein 
NKX2B) 



small EDRK-rich factor 2 



U1 small ribonucleoprotein 1SNRP
homolog; match to PID:g4050087 



APC10 



putative protein 



poly(A)-specific ribonuclease 



long CBL-3 protein 



similar to ciliary dynein beta heavy 
chain; 78% Similarity to P23098 
(PID:g118965)



1842 



1294 



238 



1437 



294 



4032 



990 



194 



3351 



2547 



5065 



% 

IDENTITY 



100 



99 



25 



100 



100 



100 



100 



32 



100 



99 



100 



167 



168 



169 
170 



172 



174 



AF126484 



Homo sapiens 



AF161518 



Homo sapiens 



human metallothionein-Ie 



CARD4 



M64983 



Homo sapiens 



HSPC169 



M64983 



Homo sapiens 



Homo sapiens 



fibrinogen beta chain 



AF078845 



Gallus gallus 



fibrinogen beta chain 



AC004774 



Homo sapiens 



fibrinogen beta chain 



Z98974 



Homo sapiens 



16.7Kd protein



Dlx-6 



175 



176 



177 



178 



Schizosaccharomyces pombe



W74726 



Plasmodium 
falciparum 



putative vacuolar protein sorting- 
associated protein 



liver stage antigen 



AJ222967 



Homo sapiens 



AC024796 



Homo sapiens 



Caenorhabditis 
elegans 



Human secreted protein fg949 37 



cystinosin 



contains similarity to TR:076167 



381 



4961 
1604 



2482 



2679 



1059 



786 



923 



185 



283 



1879 



1920 



221 



100 
100 



100 



100 



100 



78 



100 



100 



31 



23 



100



100 



27 



180 



AF151803 



Homo sapiens 



181 



G02694 



Homo sapiens 



Membrane-bound protein PRO276.



CGI-45 1 



Homo sapiens 



protein 

Human secreted protein, SEQ ID 
NO: 6775. 



1370 



215 



283 



100 



28 



100 



183 



184 
185 



AF234765 



AF151855 
AF289664 



Homo sapiens 



Rattus 
norvegicus 



Human cell death preventing kinase 
(DPK-1) protein sequence. 



Homo sapiens 



Mus musculus
Homo sapiens 



serine-arginine-rich splicing 
regulatory protein SRRP86 



CGI-97 protein 



CYLN2 



2676 



1214 



4673 



100 



27 



96 



dJ1042K10.2 (supported by
GENSCAN, FGENES and
GENEWISE)



4059 



100 



Homo sapiens 



188 



189 



190 



192 
T9T 



X83543 



AF059569 



Homo sapiens 



M18135 



Homo sapiens 



AF242194 



Rattus 
norvegicus 



D30689 
Y44984 



Drosophila 
melanogaster 



Bacillus 
subtilis 



dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE)



APXL 

actin binding protein MAYVEN 



smooth-muscle alpha tropomyosin 



brakeless-B 



subunit of nitrite reductase 



2332 



100 



8513 



3106 



99 



1306 



147 



113 



99 



95 



52 



29 



Homo sapiens | Human epidermal protein-1.



538 



97 



SEQ
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


194 


B25679 


Homo sapiens 


Human secreted protein sequence 
encooea oy gene id oe.ki jjj inv/.oo. 


760 


100 


195 


AB020315 


TOT 


homologue of mouse dkk-1 gene:Acc 


1 Ajf\(\ 
1*K)0 


1 f\f\ 


196 


U35730 


Mus musculus 


jerky 


ZUZ1 


/5 


197 


AL136450 


Homo sapiens 


dJ510O21.1 (novel protein) 


632 


100


198 


X56203 


Plasmodium 
falciparum 


liver stage antigen 




24 


199 


Y70775 


Homo sapiens 


Follistatin-related protein ztsta.


202/ 


63 


200 


X87237 


Homo sapiens 


a-glucosidase I 


A A AH 

444/ 


on 


201 


AF101078 


Caenorhabditis
elegans 


CLU-1 


1393 


46 


202 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


6611 


100 


203 


X00474 


Homo sapiens 


pS2 precursor 


466 


100 


204 


AB029333


Halocynthia 
roretzi 


HrPET-1 


974 


54 


205 


AF146019 


Homo sapiens 


hepatocellular carcinoma antigen 
gene 520 




1 HA 
1UU 


206 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRP1


OJ2 


1 HA 
1UU 


207 


AB038162 


Homo sapiens 


trefoil factor 2


/44 


i Art 


208 


U30521 


Homo sapiens 


P311 HUM


Jo J 


100


209 


AB000911 


Sus scrofa 


ribosomal protein 


782 


100 


210 


AB021227 


Homo sapiens 


membrane-type-5 matrix 
metalloproteinase 


3545 


100


211 


AF180920


Homo sapiens 


cyclin L ania-6a


2722 


100


212 


AF105365 


Homo sapiens 


K-Cl cotransporter KCC4 


5624 


100


213 


U29244


Caenorhabditis 
elegans 


similar to human (TRJE) transforming 
protein (PIR:b22157)


602


32 


214 


AL033538 


Homo sapiens 


dJ477H23.1 (novel protein)




100


215 


X52011 


Homo sapiens 


muscle determination factor 


1262 


100 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100


217 


AF006751 


Homo sapiens 


ES/130 


4793 


99


218 


AB007859 


Homo sapiens 


KIAA0399 protein 


3559


99


219 


AK026291 


Homo sapiens 


unnamed protein product 


526 


100


221 


Y84045 


Homo sapiens 


Splice variant of cancer associated 
polypeptide CHI -9a 11-2. 


5851 


97 


222 


Z67996 


Homo sapiens 


tenascin-R (restrictin) 


7 loo 


100


223 


AF134802 


Homo sapiens 


cofilin isoform 1 


846 


100 


224 


Y17711 


Homo sapiens 


atopy related autoantigen CALC 


1611 


99 


225 


AF190051


Gallus gallus 


hepatocyte nuclear factor 1a
dimerization cofactor isoform 


443 


81 


226 


AK026256 


Homo sapiens 


unnamed protein product 


OOO 


no 
yo 


227 


Z69368 


Schizosacchar 
omyces pombe 


nuf2-like coiled-coil protein


2JU 


2j 


228 


AF275948


Homo sapiens 


ABCA1


1 1 /Oj 


oo 


229 


AF161384 


Homo sapiens 


HSPC266 


20UO 


oo 

y<$ 


230 


Y16270 


Homo sapiens 


paralemin 


1951 


100 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


2379 


99


232 


W88499 


Homo sapiens 


Human stomach carcinoma clone 
HP10412-encoded protein.


1545 


99 


233 


AF096286 


Mus musculus 


pecanex 1 


3623 


93 


234 


V64619_cd1


Homo sapiens 


30-NOV-1990 Human HE1 cDNA. 


796 


100


235 


V64619_cd1


Homo sapiens 


30-NOV-1990 Human HE1 cDNA. 


470 


98 


236 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684024.2 (prodynorphin (Beta- 


1330 


100 






SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%

IDENTITY


239


AF262027 


Homo sapiens 


Neoendorphin-Dynorphin precursor 
Proenkephalin B precursor)) 
eIF-5A2 






240 
241 


AL079344 
AC002394 


thaliana 


putative protein


808 
394 


100 
33 


242 


AJ271361 


Homo sapiens 


Gene product with similarity to 
dynein beta subunit 


1542 


51 


243 




Takifugu 
rubripes 


KKANK2 protein


303 


30 


244 
245 


AL021918 
AF190167 


Homo sapiens 
Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 

membrane associated protein SLP-2 


1476 
1736 


48 
99 


246 
247 


Y10601 
AL121771


Homo sapiens 
Homo sapiens 


ankyrin-like protein 
dJ548G19.1.1 (novel protein 
(ortholog of mouse zinc finger 
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em:AK001596)) 
(isoform 1)) 


5877 
3628 


100 
100 


248


L25314
X63745 


Drosophila 
melanogaster 
Homo sapiens 


actin-related protein
KDEL receptor 


984 


47 


249 
250 


AF112208


Homo sapiens 


13kDa differentiation-associated 
protein 


1 1095 
816 


100 
100 


251 


AP001707 
AL136125 


Homo sapiens 
Homo sapiens 


human gene for claudin-8, Accession 

No.AJ250711 

dJ304B14.1 (novel protein) 


1172 


100 


253 


AL031186 
Y17531 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


bK984G1.1 (supported by FGENES)


778 
1 JO 

532 


100 
100 


254 
255 
256 


AL049843 
AJ242972 
Y94873 


Human secreted protein clone BL205 
14 protein. 

OJo92M17.3 (KIAA0349 protein)

TOLLIP protein

Human protein clone HP02632


639 

6741 J 
1424 


100 

99 
99 


ZD / 

258 
259 

260 
261 


AF279865 
AL024498 
R66278 

AF101784 
AF101784 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


kinesin-like protein GAKIN 
dJ417M14.1 (novel protein)
Therapeutic polypeptide from 
glioblastoma cell line, 
b-TRCP variant E3RS-IkappaB
b-TRCP variant E3RS-IkappaB


10/0 

2903 
589 
830 

3226 


100 
100 
100 
100 

99 


262 
263 

264 


AF101784 
AF197060


Homo sapiens

Homo sapiens 


b-TRCP variant E3RS-IkappaB

src homology 3 domain-containing

protein HIP-55 


2821 
3149 
2257 


100 
99 
100 


265 
266 


Y86262 

Y56966
Y56966


Homo sapiens 
Homo sapiens


Human secreted protein HAQAR23

SEQ ID NO:177.

Human SBPSAPL polypeptide. 


766 
2779 


100 
100 


267 

268 
269 
270 
271 


AJ300465

AC004030
X55954
AB033921


Homo sapiens

Homo sapiens
Homo sapiens
Mus musculus


Human SBPSAPL polypeptide

putative white family ATP-binding
cassette transporter

^1856 2 

HL23 ribosomal protein
Ndr1 related protein Ndr2


1018 
1557 

3579 
714 

1 oDj 


99 
95 

99 
100 
94 


272 
273 
274 

275 


AF081886
AF166492
AL022238
W88667

X00129


Homo sapiens
Homo sapiens
Homo sapiens
Homo sapiens

Homo sapiens


sKUl-like protein 
small GTPase RAB6B
dJ1042K10.4 (novel protein)
Secreted protein encoded by gene
34 clone HAIBP89.
precursor RBP


1905 
1060 
2201 


99 
100 
100 

Aft 

99 


276 Z 

277 J 


47500_cdl H 
UB049188 E 


Homo sapiens

Equus caballus


l-MAY-1998 Human RHOH gene 

sequence.

ubiquitin C-terminal hydrolase


1044 
1161 

1118 


97 
100 

96 
SEQ


ACCESSION 


SPECIES


DESCRIPTION 


SMITH- 


% 


ID 


NUMBER 






WATERMAN 


IDENTITY 


NO: 








SCORE




278


Ar270o47 



Homo sapiens 


PTT1 
\Jl 1 1 


1 J 04 


100


279


An a TOC/r 


Mus musculus 


coronin-2 


1/1 1 A 


94 


280 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031


100 


282 


D83948 


Rattus 


S1-1 protein


3975 


90 






norvegicus 








283


Y147oo 


Homo sapiens 


I Kappa B-like protein 


203 / 


100 


Zoo 


AL031316


Homo sapiens 


aJ28<Jl0.3(HoDl 1B1 


294 


100








(hydroxysteroid (11-beta)












dehydrogenase 1) 






287


D64109


Homo sapiens 


tob family


1773 


99 


288 


AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 


M61866 


Homo sapiens 


Krueppel-related DNA-binding 


209 


90 








protein 






290 


AJ001810


Homo sapiens 


mRNA cleavage factor I 25 kDa


1217 


100 








subunit 






291 


Y99454 


Homo sapiens 


Human PRO1605 (UNQ786) amino


694 


100 








acid sequence SEQ ID NO:395. 






292 


Y44824 


Homo sapiens 


Human molecule associated with cell 


2370 


100 






proliferation, MACP-4. 






293 


AJ276101 


Homo sapiens 


GPRC5B protein


2099 


100 


294 


AF161406 


Homo sapiens 


HSPC288 


719 


100 


295 


Y58628 


Homo sapiens 


Protein regulating gene expression 


1276 


100 








PRGE-21. 






296 


U91561 


Rattus 


pyridoxine 5-phosphate oxidase 


1239 


87 






norvegicus 








297 


L02956 


Xenopus 


ribonucleoprotein 


1624 


83 






laevis 








298 


AF226730 


Homo sapiens 


Cyt19


1729 


99 


299 


AF226730 


Homo sapiens 


Cyt19


906 


98 


300 


Y54324 


Homo sapiens 


Amino acid sequence of a human 


718 


89 








gastric cancer antigen protein. 






301 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 


1606 


100 








isoform 






302 


Y32206 


Homo sapiens 


Human receptor molecule (REC) 


1676 


98 








encoded by Incyte clone 2825826. 






303 


AF247565 


Homo sapiens 


hepatocellular carcinoma associated 


525 


100 








ring finger protein 






304 


AF208844


Homo sapiens 


BM-002


428 


100 


305 


AC004983 


Homo sapiens 


similar to PID:g3877944


1988 


100 


306 


AL132978


Arabidopsis 


putative protein 


210 


25 






thaliana 








307 


Y10530


Homo sapiens 


olfactory receptor 


1645 


100 


"5 AO 

308 


AF180681


Homo sapiens 


guanine nucleotide exchange factor 


3597 


100 


309 


AF111856


Homo sapiens 


sodium dependent phosphate 


3591 


99 








transporter isoform NaPi-3b 






310


Y 1 J->o3 



Homo sapiens 


G-protein coupled receptor 


21/1 


100


311


Z/3420 


Homo sapiens 


-*T7 1 A /111 1 f\ 1 A^AJIUJLJL t . . ,.,.,1 „ 

obi4ouiu.2 v. me rcaptopyruvate - 


1598


100








sulfurtransferase (EC 2.8.1.2))






312 


X/y33j 



Homo sapiens 


beta tubulin 


2348


100


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens 


SURF-4 


1395 


100 


317 


Z37986 


Homo sapiens 


phenylalkylamine binding protein


1258 


100 






Macaca 


hypothetical protein 


2Do 


82 






fascicularis 








321 


Y25755 


Homo sapiens 


Human secreted protein encoded 


1440 


100 








from gene 45. 






322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein 


274 


49 
349



AL032631 



350 



351 



U70669 



Arabidopsis 
thaliana 

Caenorhabditis 
elegans 



Homo sapiens 



putative cleavage and
polyadenylation specifity factor
Y106G6H.8



194 



Fas-ligand associated factorX
Amino acid sequence of a potassium
SEQ
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 




AUUU7153 



Arabidopsis 

thaliana


IUUujZ 


1 JO 


24 


364 


AF197927 


Homo sapiens 


AF5q31 protein 


3992 


99 


365


ty^cc-aa 

Uzfi DUU 


Homo sapiens 


mitochondrial isoleucine tRNA 
synthetase 


4Z50 


98


366


Ay/ooo 


Homo sapiens 


arylsulphatase 


"21/11 

J 141 


98 


367


AT l£OA/!Q 

A1/10ZU4© 


Homo sapiens 


hypothetical protein 


1532


100 




JLoOUOZ 


Mus musculus 


steroidogenic acute regulatory 
protein 


1 OQ 


25 


369


A 171 1 lO/IO 


Homo sapiens 


multiple domain putative nuclear 
protein 


1022


59 


370


JVllDooo 


Bos taurus 


endozepine-related protein precursor 


7/l7< 
Z4Z5 


84


371 


X66363 


Homo sapiens 


serine/threonine protein kinase 


2562 


100 


372


W /4oUZ 


Homo sapiens 


Human secreted protein encoded by 
gene is clone Mov£bt>zj. 


1532 


89 


373


At 1UU / /Z 


Homo sapiens 


tenascin-M1


11535


99 


374 


AF090934


Homo sapiens 


PRO0518 


382 


100 


375 


AJdU21643 


Homo sapiens 


gonadotropin inducible transcription 
repressor-3 


2761 


99 


376 


AB049758 


Homo sapiens 


MAWD binding protein


1331 


100 


377


AF070o6o 


Homo sapiens 


Kruppel-associated box protein 


466 


97 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 
p62 


464 


60 


379 


AF149205 


Mus musculus 


Su(var)3-9 homolog Suv39h2 


1690 


88 


380


AF227906


Homo sapiens 


UDP-glucose:glycoprotein
glucosyltransferase 2 precursor


7851 


99 


381 


AF118566


Mus musculus 


hematopoietic zinc finger protein 


1769 


92 


382 


AK000619 


Homo sapiens 


unnamed protein product 


810 


100 


383 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein 
glucosyltransferase 2 precursor


7851 


99 


384 


AF117946


Homo sapiens 


Link guanine nucleotide exchange 
factor II 


2363 


100 


385 


AF125390 


Drosophila 
melanogaster 


L82G 


139 


41 


386 


Y94907 


Homo sapiens 


Human secreted protein clone 
cal06 19x protein sequence SEQ ID
NO:20. 


1092 


50 


387 


U18795


Saccharomyces cerevisiae


Yel064cp 


206 


28 


388 


AF177388 


Homo sapiens 


cancer-amplified transcriptional 
coactivator ASC-2 


10748 


99 


389


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N- 
acetylgalactosaminyltransferase 7 


3469 


96 


390


AF097366 


Homo sapiens 


cone sodium-calcium potassium 
exchanger 


3166 


100 


391 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 




UoJOJj 


Rattus 
norvegicus 


ankyrin binding cell adhesion 
molecule neurofascin 


3967 


91 




A05ZZ4 


Gallus gallus 


neurofascin 


4097


78 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA
-19 to 4525)


4292 


99 


395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


396


/VDU1 /UZO 


Mus musculus


oxysterol-binding protein 


O 1 11 

Zl 15 


AO 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240) 


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 
gene 85 clone HSDFV29. 


722 


92 


399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 
(HYDRL-8). 


1637 


99 
SEQ
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






elegans 


carboxyl-terminal hydrolase (Pfam:
UCH-1.hmm, score: 28.46) (Pfam:
UCH-2.hmm, score: 47.53)






444


D78017 


Rattus 
norvegicus 


NFI-A1


2667 


98 


445


ALU4y5o9 


Homo sapiens 


aJi7C10.3 (novel A I rase) 


2418 


100 


/M C 

44o 


AJ24254U 


Volvox carteri
f. nagariensis


hydroxyproline-rich glycoprotein 


165 


34 






Homo sapiens 


£*pirZj i protem 




1 AA 


450 


AJ133352 


Homo sapiens 


ZNF237 protein 


1025 


96 


451 


Arl7U7Uo 


Homo sapiens 


T-box protein TBX3


3700 


99 


452


A VAAOnOA 

AKOU/OoO 


Homo sapiens 


unnamed protein product 


1546 


99 


453


L32977 


Homo sapiens 


Rieske Fe-S protein 


1239 


93 


454


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


1533 


57 


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7
clone HTLFA90.


1453 


99 


456 


AB006631 


Homo sapiens 


The human homolog of mouse Cux-2 


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002 


Caenorhabditis 
elegans 



similar to acyl-CoA dehydrogenases 
and epoxide hydrolases;Pfam 
domain PF00441 (Acyl-CoA dh), 
Score=57.4, E-value=1.7e-16,N=2; 
contains similarity to Pfam domain 
PF00702 (Hydrolase), Score=57.4, 
E-value=1e-13,N=1


583 


37 


461 


AK023114


Homo sapiens 


unnamed protein product 


1041 


99 


462 


M93134 


Friend murine 
leukemia virus 


pol protein 


289 


44 


463 


AF055473 


Homo sapiens 


GAGE-8 


232 


47 


466 


Y51415 


Homo sapiens 


Human wild type pKe83 protein. 


2625 


100 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 


468 


Y57936 


Homo sapiens 


Human transmembrane protein 
HTMPN-60. 


1629 


96 


469 


D38552 


Homo sapiens 


The hal539 protein is related to 
cyclophilin. 


2995 


100 


470 


Y70013 


Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7). 


3530 


100 


471 


AJ224747 


Homo sapiens 


C-terminal variant of hINADL 
including 2 amino acid exchanges 
and an insertion of 28 amino acids in 
frame. 


7969 


100 


472 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein. 


1546 


100 


473 


W99665 


Homo sapiens 


Human secreted protein clone 
dul57_12 protein.


998 


98 


474


X63526 


Homo sapiens 


homologue to elongation factor 1- 
gamma from A.salina 


2273 


99 


475 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125)


644 


100 


476 


M60832 


Homo sapiens 


alpha-2 type VIII collagen 


3581 


99 


477 


AF039697 


Homo sapiens 


antigen NY-CO-31


1213 


97 


478


AF156929


Sus scrofa 


inflammatory response protein 6 


1588 


83 


479 


AF264717 


Homo sapiens 


FYVE domain-containing dual 
specificity protein phosphatase 
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P 


2478 


94 


481 


X89750 


Homo sapiens 


TGIF protein 


1413 


100 
507
508 



AJ293309
U39045



Homo sapiens 



coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3



Rattus 
norvegicus 



NHP2 protein 



cytoplasmic dynein intermediate
chain 2B 




801 
3241 



100 
SEQ
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


515





Escherichia 
coli 


similar to 


1489


100 


516




Homo sapiens 



zinc finger protein Hsal2 


conn 

5^yu 


100 


517




Mus musculus


apoptosis-linked gene 4, deltaC form




78 


518 


AF019926 


Mus musculus 


protein kinase 


1694 


90 




JVLJ40 1 j 


Homo sapiens 


omega protein 


317 


91 


520


Y Uooiz 


Homo sapiens 


88kDa nuclear pore complex protein 


2313 


99 




I UoOlZ 


Homo sapiens 


88kDa nuclear pore complex protein 


1561 


99 


522


AiAJyo/DO 


Homo sapiens 


dA59H18.1 (KIAA0767 protein)


2497 


100 






Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 




a Dmnn i **> 
AdU29U12 


Homo sapiens 


KIAA1089 protein


4933 


100 


525 


AB026893 


Homo sapiens 


vascular cadherin-2 


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665_2


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit 
preprotein 


2639 


100 


530 


U80446 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5


420 


39 


531 


S76838 


Mus sp. 


Dbs 


4821 


88 


532 


Z82215 


Homo sapiens 


dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 


9828 


100 


533 


AF245505 


Homo sapiens 


adlican 


277 


31 


534 


AF300612 


Homo sapiens 


N-acetylgalactosamine-4-O- 
sulfotransferase 


993 


59 


535 


AL121928


Homo sapiens 


bA 18114.3 (pleckstrin and Sec7 
domain protein) 


3333 


99 




a rni/icc 
AJz/1Uj5 


Mus musculus 


iroquois homeobox protein 6 


1724 


76 


537 


AF180473


Homo sapiens 


Not2p 


2267 


100 


538


AF071059


Mus musculus


zinc finger RNA binding protein


1089 


51 


539 


AF023453 


Homo sapiens 


actin-related protein 3-beta 


2219 


100 


540 


AC003030 


Homo sapiens 


R29828_1


1401 


70 


541 


AC003030 


Homo sapiens 


R29828_l 


2294 


100 


542 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


2152 


100 


543 


AB006135 


Rattus 
norvegicus 


db83 


1238 


98 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644 


97 


545


Y07595 


Homo sapiens 


transcription factor TFIIH


2373 


100 


546


AL133545


Homo sapiens 


bA386N14.1 (novel protein similar 
to a dual specificity phosphatase) 


964 


99 


547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA 
synthase 


2647 


100 


548 


AF134726


Homo sapiens 


NG37 


4359 


99 




A "DrtO CO C^ 

Ab055356 


Homo sapiens 


neurexin I-alpha protein 


6948 


99 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma-1


5215 


99 




A T}f\A1 *£1 A 

A±>043o34 


Homo sapiens 


PAR-6A 


885 


100 


553




Homo sapiens 


partial CDS 




99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 






AL121893 







Homo sapiens 



bA189K21.5 (novel protein similar 
to retinoblastoma binding protein 
(RBBP9)) 



568 



569 



570 
571 



572 



573 



AF228603 
AF239243 



Homo sapiens 



dJ876B10.2 (novel protein (ortholog 
of rat EXO84))



3713 



AF087695 
AB046381 



AC005551 



Y90290 



574 



575 



576 



577 



578 



579 



580 



581 



582 



W76734 
AL121935 



Y86217 



AL121716 



AL121716 



X92715 



Homo sapiens 



Mus musculus 



pleckstrin 2 
histone deacetylase 7 



1841 



Homo sapiens 



veli3 



3244 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



testis-abundant finger protein 



K26529J2, partial CDS 



989 
1346 



Human peptidase, HPEP-7 protein 
sequence. 



1020 



274 



Human mDia Rho targeting protein. 



bA517H2.3 (t-complex 10 (a murine 
tcp.homolog)) 



712 



853 



Human secreted protein HWHGU54 
SEQ ID NO:132.



2123 



X54637 



X78817 



AJ251245 



Homo sapiens 
Homo sapiens 



dJ202D23.2 (novel protein)
dJ202D23.2 (novel protein)
KRAB/C2H2 zinc finger protein



6329 



6329 



Homo sapiens 



584 



585 



AF113125
M19529 



AF169677
D87685 



Rattus 
norvegicus 



Homo sapiens 
Sus scrofa 



protein tyrosine kinase
p115



SECIS binding protein 2 



E-l enzyme 



Homo sapiens 



follistatin A 



3102 



5564 



1148 



3086 



leucine-rich repeat transmembrane 
protein FLRT3 



581 



1906 



3403 



99 



100 



86 



100 



99 



100 



52 



32 



78 



99 



99 



99 



97 
98 



44 



71 



100 



98 



100 



Homo sapiens 



587 
588 



Y00876 



Y99674 



589 



D86973 



Homo sapiens 



similar to human transcription factor 
TFIIS (S34159).



8083 



Homo sapiens 



Human LAPH-1 protein sequence 



Human GTPase associated protein- 



25 



Homo sapiens 



similar to Yeast translation activator 
GCN1 (P1:A48126) 



2110 



2111 



12033 



99 



100 



99 



99 



591 



592 
593 



Homo sapiens 



Y57396 



Homo sapiens 



dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 



1979 



AJ297743 
AF164796



594 



Y41312 



Y41312 



597 



AF215703



Mus musculus 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Human lysoenzyme LYC4 
polypeptide. 



814 



torsinB protein 



NADH:ubiquinone oxidoreductase 
MLRQ subunit homolog 




Human secreted protein encoded by 
gene 5 clone HLDRM43. 



Human secreted protein encoded by 
gene 5 clone HLDRM43 



1448 
469 



749 



Homo sapiens | Human neurotransmission-associated 

.— Protein (NTAP) 99 8868. 
Drosophila | KISMET-L long isoform



824 



2102 



100 



100 



85 



100 



94 



100 



98 






SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 






melanogaster 








598


ATU /U44 / 


Homo sapiens


barrier-to-autointegration factor 


zyu 


90


599


AD OZU j 


Plasmodium 

falciparum


liver stage antigen 


•2 11 

3/z 


22 


600 


X79828 


Mus musculus 


NK10 


202 


53 


601


Ax5UU41Uy 


Cricetulus
griseus


pnospnanayisenne syntnase a 


hai 
zzoz 


92


602


T TQ/1QCC 


Mus musculus 


JNUlpl 


Zy\l 


89


603




Mus musculus 


NUlpl 


2800


86 


604


ArUU0Z04 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2850


100 


605


Ar UU0Z04 


Homo sapiens 


recombination and sister chromatid 
cohesion protein homolog 


2530 


100 


606


AozZOU 



Homo sapiens 


n„„p/i r>l 


Z9z9 


100 


607 


X82260 


Homo sapiens 


RanGAP1


1843 


97 


608 


AF160909


Drosophila 
melanogaster 


BcDNA.LD03471


943 


58 


610


X74801


Homo sapiens 


gamma subunit of CCT chaperonin 


2745 


99 


611


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein)


1608 


100 


612


Y710/Z 


Homo sapiens 


Human membrane transport protein, 
MTRP-17. 


445 


100 


613 


X16396 


Homo sapiens 


precursor polypeptide (AA -29 to
315)


1749 


100 


614


AkUOOzoI 


Homo sapiens 


unnamed protein product 


1814 


99 


615 


ABO J I J28 


Homo sapiens 


KIAA0556 protein 


5761 


99 


616 


U19361


Petromyzon 
marinus 


NF-180


205 


21 


617


A 17 A/1 ^^^^ 

Ar U4j jjD 


Homo sapiens 


wbscrl 


1208


100


618


ArU4jJJ j 


Homo sapiens 


wbscrl alternative spliced product 


1318


100 


619




Felis catus 


ribosomal protein L41


1 io 
Izo 


100


620


VI "71 AO 

Y 1 / 1 Oy 


Homo sapiens 


A6 related protein 


1819


100 


621


I 1ZUOD 


Homo sapiens 


nJNopDo 


2956


99


622


Ar 1 / / / JO 


Homo sapiens 


ubiquitin specific protease 16


z99o 


100


623 


AF317425


Homo sapiens 


GAC-1 


3866 


100 


624


A T ACAOAT 

AL050zy7 


Homo sapiens 


hypothetical protein 


1227 


99 


625 


AC007204 


Homo sapiens 


BC273239_1


3398 


99 


626 


Z68747 


Homo sapiens 


imogen 38 


2024 


99 


627 


Z68747 


Homo sapiens 


imogen 38 


1958 


97 


628 


Y70229 


Homo sapiens 


Human RNA-associated protein-10
(RNAAP-10). 


3424 


99 


629 


AF191492 


Homo sapiens 


nasopharyngeal carcinoma associated 
gene protein-8 


613 


100 


630 


AF119664


Homo sapiens 


transcriptional regulator protein 


1574 


100 


631


AF119664


Homo sapiens 


transcriptional regulator protein 
rlCJNur 


1150 


89 


632


VI 7QAQ 


Homo sapiens 


ganglioside-induced differentiation 
associated protein 1 


1839


98




ajj /4U 


Homo sapiens 


5-nucleotidase 


3012


100


634 


AF039688 


Homo sapiens 


antigen NY-CO-3 


931 


100 


635 


AF119662


Homo sapiens 


E46 protein 


2424 


100 


636 


AB007836 


Homo sapiens 


Hic-5 


2544 


100 


637


ArU//olo


Mus musculus 


syntrophin-associated serine- 
threonine protein kinase 


2027 


44 


638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle- 
associated membrane protein)- 
associated protein B and C) 


150 


26 


639 


AF078844 


Homo sapiens 


hqp0376 protein 


416 


81 






WO 01/57190 



PCT/US01/04098 



SEQ 
ID 
NO: 



640 



641 



642 



643 



644 



646 



648 



650 
651 



ACCESSION 
NUMBER 



U28377 



SPECIES 



AK024442 
U58682 



X57432 



AB002348 



Y96202 



AB029482 



AB009053 



AC002550 
U26592 



Escherichia 
coli 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



ORF_f239; was ORF_f191 and
ORF_f194 before splice



Rattus rattus 



Homo sapiens 
Homo sapiens 



Mus musculus 



Arabidopsis 
thaliana 



Homo sapiens 



FLJ00032 protein



ribosomal protein S28



ribosomal protein S2 



SMITH- 
WATERMAN 
SCORE 



1198 



1677 



KIAA0350 protein 



IkappaB kinase (IKK) binding
protein, Y2H56. 



JNK-binding protein JNKBP1 
contains similarity to isoamyl 
acetate-hydrolyzing 
esterase~gene_id:MQB225 



340 



1520 



5186 



1178 



4609 



407 



Unknown gene product 



858 



% 

IDENTITY 



100 



56 



100 



98 



99 



81 



44 



99 



652 



653 



X60155 
X53330 



Homo sapiens 



Homo sapiens 



diabetes mellitus type I autoantigen 



Platynereis 
dumerilii 



zinc finger 41 



253 



H4 protein (AA 1 - 103)



4349 



523 



654 



AC003682 



655 
656 



X80473 
J02649 



657 



AC006014 



658 
659 



X92972 



660 



661 



662 



663 



L35269 



AC003682 



X79204 



X17620 



664 



665 



666 



667 



668 



670 



671 



672 
673 



AB015617 



Homo sapiens 



R27945 2 



Mus musculus 
Rattus 



rab19



2558 



norvegicus 



unknown protein 



596 



201 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Z56281 



AJ248283 



Homo sapiens 
Homo sapiens 



similar to RFP transforming protein;
similar to P14373 (PID:g132517)



1331 



protein phosphatase 6 



zinc finger protein



1666 



F18547_1



2803 



ataxia- 1 



3184 



Z70200 
Z70200 



AF153450 



AF227198 



X99586 



Pyrococcus 
abyssi 



Homo sapiens 
Homo sapiens 



Nm23 protein 
ELKS 



4195 



965 



interferon regulatory factor 3 



LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I).



Manduca sexta 



Homo sapiens 



Z61589 cdl 



AJ132702 
AF204159 



Homo sapiens 



U5 snRNP-specific 200kD protein 



1501 



2331 



254 



U5 snRNP-specific 200kD protein 



juvenile hormone esterase binding 
protein 



Homo sapiens 



Mus musculus 
Homo sapiens 



CrkRS 
SMT3C protein 



17-AUG-1998 DNA encoding a 
human OC-2 protein. 



ATFa-associated factor 



8819 



8589 



225 



7231 



441 



2593 



3240 



66 



100 



100 



100 



56 



95 



99 



100 



99 



96 



99 



99 



80 



100 



40 



99 



97 



32 



99 



87 



100 



potassium large conductance 
calcium-activated channel beta 3 a 
subunit 



1486 



100 



G02061 



Homo sapiens 



675 



G01246 



Human secreted protein, SEQ ID 
NO: 6142. 



558 



Homo sapiens 



Human secreted protein, SEQ ID 
NO: 5327. 



141 



99 



77 



AB016839 



677 



D86970 



Homo sapiens 



mobl 



Homo sapiens 



similar to myosin heavy chain: 
Containing ATP/GTP-binding site 
motif A(P-loop) 



419 



161 



42 



28 



678 



U83115 
AF203687 



Homo sapiens 
Homo sapiens 



non-lens beta gamma-crystallin like
protein 



8569 



99 



prolactin regulatory element-binding 
protein 



2181 



100 






SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH-
WATERMAN 
SCORE 


% 

IDENTITY 




AVLz /OoO 


Mus musculus 


ultra-high sulphur keratin 


650 


58 


681


uu4yoo


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


97 


682


AT I lyOOJ 


Homo sapiens 


G-protein gamma-12 subunit


300 


100 


683


UUJ / J J 


Homo sapiens


Human secreted protein, SEQ ID 

MO- 7R14 


34z 


100 




Yfi7fiQQ 
yvu / xjyy 




ljjwjl anugen 


ion 

zy / . 


100


685 


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme I 


1892 


100 


686


A Trim nnfi 


Mus musculus


HMegjz protein


935 


96 


687


W \JJ J I o


Homo sapiens 


Prostaglandin DP receptor. 


1 OZA 

loo4 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689


a 17 1 z&zzn 
Ar lOOOO/ 


Homo sapiens 


stomatin related protein 


2036 


100 


690


oujyou 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593 


100 


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692


AL031115


Homo sapiens 


ZXDA, ZXDB (zinc finger X-linked
protein) 


4298 


100 


693 


L40410


Homo sapiens 


thyroid receptor interactor 


806 


100 


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING 
PROTEIN-like; similar to P22059 
(PID:g129308)


2533 


99 




Ar 169411 


Rattus 
norvegicus 


PAPIN 


4144 


52 


696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH-4.


2144 


100 


697 


AF271994 


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100 


698 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


699 


AL133506 


Unknown 


/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:


825 


48 


700 


Y96870 


Homo sapiens 


Human goose-type lysozyme 
(GOLY). 


1032 


100 


701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100 


702





Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


937 


95 


703 


AJZ4Z50Z 



Homo sapiens 


calpain 


3756 


100 


704 


oOZOZH- 


Homo sapiens 


unknown 


185 


100 


705


/VTUUOUO 1 


Homo sapiens 


skin-specific protein


652 


100 


706 


Y16793 


Homo sapiens 


keratin, type I 


2232 


100 


707


i 44^80 


Homo sapiens 


Human epidermal protein-2. 


455 


69 


708


A Pi 1 ^Ofl 
AT 1 1 JZZU 


Homo sapiens 


Mb 1 rl)40 


686 


100 


709


j 44y o j 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710


I lOJoZ 


Homo sapiens 




1874 


100 


711


iOo/O 


Homo sapiens 


Amino acid sequence of a human 
pnospnoryianon enector rxior- /. 


2407 


100 


712




Homo sapiens


H(+)-transporting ATP synthase 


zuy 


100


713




Mus musculus 


un a oinaing protem uc,oK 1 


1467 


79 


714 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


715


A TO^T^QQ 
AJz/ / /Jy 


Homo sapiens 


KriD 1 1 b 1 alpha protem 


480 


98 


716 


AL135791 


Homo sapiens 


bA162G10.3 (zinc finger protein)


401 


98 


717


Arz234oo 


Homo sapiens 


HT015 protein


1311 


97 


719 


AFn73Jtt 

AI 1 I / J OJ 


Homo sapiens


r\1 Q^^Afo I nrntAin 1?* DDI'S 

placental proiem i j, rrij 




1 AA 


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP)) 


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436


Homo sapiens 


Human secreted protein, SEQ ID 


418 


96 






SEQ 
ID 
NO: 

723 


ACCESSION 
NUMBER 

AF282919 


SPECIES 

Mus musculus 


DESCRIPTION 

NO: 5517. 
Zfp228 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


724 
725 


AB023191 
AL031778 


Homo sapiens 
Homo sapiens 


KIAA0974 protein 
dJ34B21.1 (novel BZRP 
(benzodiazapine receptor (peripheral)
(MBR, PBR, PBKS, EBP,
Isoquinoline-binding protein)) LIKE
protein) 


349 
920 


49 
100 
100 


726 


AL021939 


Homo sapiens 


dJ352A20.2 (aldehyde 
dehydrogenase family protein) 


1764 


1 f\f\ 


727 


AF182426 


Rattus 
norvegicus 


arylacetamide deacetylase 


701 
iy i 


42 


728 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 


jjj i 


yy 


729 


AF155135 


Homo sapiens 


novel retinal pigment epithelial cell 
protein 


1652 


99 


730 


AL078606 


Arabidopsis 
thaliana 


putative protein 


: 277 


55 


731 

732 


Y73352 
AF178432 


Homo sapiens 
Homo sapiens 


HTRM clone 1732368 protein 

sequence. 

SH3 protein 


i7?n 


i r\f\ 


733 


Y17832


Human 
endogenous 
retrovirus K 


env protein 


J jUZ 

223 


100
34 


734 


Y28859 


Homo sapiens 


Human mesoderm induction early 
response protein ER1.




98 


735 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B gamma 
subunit 


73S7 


yy 


736 


VQ4Q77 


Homo sapiens 


Human secreted protein clone pv6 1 
protein sequence SEQ ID NO:50. 


724 


99 


737 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


738 


Apl 177fiA 

/vr j. izzuu 


Homo sapiens 


NADH-oxidoreductase B18 subunit 


739 


100 


739 
740 


AF112200
AF302154 


Homo sapiens 
Homo sapiens 


NADH-oxidoreductase B18 subunit 
SPG protein 


61 1 


QQ 
OO 


741 
742 


R7^£R7 

DZJOOi 

L27479 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence 
encoded by gene 17 SEQ ID NO:70. 
X123 


6556 
1410 


100 
99 


743 


L27479 


Homo sapiens 


X123 


1237 

1706 


99 

07 

y / 


744 
745 


Y66745 
AJ001019 


Homo sapiens 
Homo sapiens


Membrane-bound protein PRO1186
ring finger protein


JOO 


QQ 

yy 


746 


X68453 


Sus scrofa 


tubulin-tyrosine ligase 


1292 
188? 


QQ 

yy 

Q/f 

y4 


747 


Y57897 


Homo sapiens 


Human transmembrane protein 
HTMPN-21. 


1173 


inn 


748 


AF151069 


Homo sapiens 


HSPC235 


1694 


Q6 
y\) 


749 


AF182404


Homo sapiens 


mitochondrial uncoupling protein 1 


1674 


100 


750 


AL121993 


Homo sapiens 


dJ776P7.1 (Novel protein) 


?son 


QQ 

yy 


751 


AF149825 


Homo sapiens 


PACSIN3 




i nn 

1UU 


752 


AL008635 


Homo sapiens 


dJ510H16.2 (high-mobility group 
protein 2-like 1) 


3026 


99 


/jo 


V^7Q1/1 
l3/7l4 


Homo sapiens 


Human transmembrane protein 
HTMPN-38. 


1124


100 


754 


AF285109 


Homo sapiens 


septin 3 isoform B 


1766 


100 


755 

/JO 


AF004161 

7*1 Q<C< 


Oryctolagus 
cuniculus 
Homo sapiens


peroxisomal Ca-dependent solute
carrier

thrombospondin-4


2371 


95 


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein


4239 
1857 


100 
100 


758 
759 
760 


AF190664 
AF090326
AL096677


Mus musculus
Mus musculus
Homo sapiens


LMBR2

AE-1 binding protein AEBP2
1F322G13.3 (novel protein similar to 


555 
1540 
999 


72 
97 
94 
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SEQ 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 


% 


ID 


NUMBER 






WATERMAN 


IDENTITY 


NO: 








SCORE 











bovine and mouse beta-soluble NSF 












attachment protein (SNAP-beta))






761


a pnnmm 
ALAJUjUU/ 



Homo sapiens 


Unknown gene product (partial) 


o4y 


96 


t^a 


U66372


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity


1152 


100 








modifying protein SEQ ID NO: 1.






765


TTQO 1 /TA 


Caenorhabditis


similar to molybdopterin biosynthesis


1204


65 






elegans 


MOEB proteins






766


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain


1091


100 








protein, similar to mouse and bovine 












cysteine string protein) 






767


AMJz4oy3 


Homo sapiens 


unnamed protein product 


3767 


100 


768


71 i CI O 

Z.1 IMo 


Homo sapiens 


histidyl-tRNA synthetase


2582


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 


25529 


100 








-19 to 4525) 






770


AC009360 


Arabidopsis 


Contains 3 PF|00400 WD40, G-beta 


333 


33 






thaliana 


repeat domains. 






771


AbUJ /Oo5 


Mus musculus 


LANP-like protein 


1246 


91 


772 


AL161578


Arabidopsis 


putative protein 


335 


46 






thaliana 








773 


AL161578


Arabidopsis 


putative protein 


333 


47 






thaliana 








774


AY008271 


Homo sapiens 


helicase SMARCAD1


5264 


99 


775 


Y21591 


Homo sapiens 


Human secreted protein (clone 


1127 


96 








CC332-33). 






776 


W88853 


Homo sapiens 


Polypeptide fragment encoded by


752 


100 








gene 89. 






777


W88853 


Homo sapiens 


Polypeptide fragment encoded by 


752 


100 








gene 89.






778


W88853


Homo sapiens 


Polypeptide fragment encoded by 


752 


100 








gene 89. 






779


ATI C\CA O 1 

At iyo4ol 


Homo sapiens 


RING finger protein; FXY2


3644 


100 


780 


AL035427


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.) 


1609 


54 


781 


AB026187


Homo sapiens 


protocadherin-Xa 


5244 


100 


782 


"DA A A CO 

1324458 


Homo sapiens 


Human secreted protein sequence 


1002 


100 








encoded by gene 22 SEQ ID NO: 83.






783


AB027289


Homo sapiens 


cyclin-E binding protein 1 


5421 


100 


784


G02916


Homo sapiens 


Human secreted protein, SEQ ID 


627 


100 








NO: 6997. 






785


AJz45ozz 


Homo sapiens 


type I transmembrane receptor 


4560 


100 


786


AJz4DozU 


Homo sapiens 


type I transmembrane receptor 


4624


100 


787


V/1 OA/17 

Z>4oU4Z 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788


AT mi 7Q7 

AIAJJ 1 /oz 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel


2739


100 








Collagen alpha 1 LIKE protein)






789


A TIO 1T/IC 
nJ i J IZHD 


Homo sapiens


oecz4i5 protem 


oouz 


10U 


790


API 07701 
AT 1U /ZU.} 


Homo sapiens


ataxin 2-binding protein 


T A AO 

zUUo 


100


791


i 1407U 


Homo sapiens 


procollagen alpha 2(V) 


600 


34


792


AT 01 1 O^r 


Homo sapiens 


A TO CTJ7fl T /nnita! nTv%4-A*-n\ 

ojzorizu.z (novel protem) 


1 A^T 

lzo7 


100


793


V"2£1 0/1 
I JO 1 74 


757 
/o / 


Human secreted protein 


2051


99


794


A "0/170 1 7T 

AbUzoIz/ 


Homo sapiens 


mannosyltransferase 


2138 


96 


795


AC007228


Homo sapiens 


R31665 2 


2738 


79 


796 


AL049482 


Arabidopsis 


putative protein 


436 


47 






thaliana 








797


a r*c\(\A coo 


Homo sapiens 


T>07 1 QA "3 


891


91


798 


AB037830 


Homo sapiens 


KIAA1409 protein 


7532 


100 


799 


X53193 


Homo sapiens 


5' half of the product is homologous


2232 


100 








to Bacillus subtilis SAICAR












synthetase, 3' half corresponds to the












catalytic subunit of AIR carboxylase 
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aEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION


SMITH- 
WATERMAN 


% 

IDENTITY 


800 


Y99350 


Homo sapiens 


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33.


1343 


100 


801 


AB042636 


Homo sapiens 


junctophilin type3 


1225 


A1 


802 


AB029324 


Rattus
norvegicus 


TIP120-family protein TIP120B


3916 


90 


803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


4961 


90 


804 


AF251040 


Homo sapiens 


putative nuclear protein 


2119 


100 


805 


AB033281 


Homo sapiens 


F-box and WD-repeats protein beta- 
TRCP2 isoform C


7R7Q 




806 


U87305 


Rattus 
norvegicus 


transmembrane receptor UNC5H1




y\) 


807 


AF118889


Rattus 
norvegicus 


b-tomosyn isoform 


i ^ i s s 


97 


808 


AF226993 


Rattus 
norvegicus 


selective LIM binding factor 


8703 


yj 


809 


W19919 


Homo sapiens 


Human Ksr- 1 (kinase suppressor of 
Ras). 


3Q3Q 
jyjy 


yy 


810 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


I 546 


1UU 


811 


AC002542 


Homo sapiens 


similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)


2204 


1 nn 


812 


U83246 


Homo sapiens 


copine I 


606 


52 


813 


AF242552 


Gallus gallus 


retinovin 


045 




814 


X52332 


Homo sapiens 


zinc finger protein 10 




yj 


815 


X52332 


Homo sapiens 


zinc finger protein 10 


2423 


99 


816 


Y09631 


Homo sapiens 


PIBF1 protein 


2935 


99 


817 


X71997 


Rattus 
norvegicus 


myosin I 


JOOj 


9o 


818 


AY004877 


Mus musculus 


cytoplasmic dynein heavy chain


11105 


98 


819 


Y27196 


Homo sapiens 


Human cyclic nucleotide 
phosphodiester PDE8B(E) amino 
acid sequence. 


j /y\J 


100


820 


AF081947 


Mus musculus 


tektin 


1 1 


Q 1 
51 


821 


AL035106 


Homo sapiens 


dJ998C11.1 (continues in
Em:AL445192 as bA269H4.1) 


871 


1UU 


822 


AF022795 


Homo sapiens 


TGF beta receptor associated protein- 
1 


385 




823 


AF015770


Mus musculus 


radical fringe 


1422 


OZ 


824 


U82695 


Homo sapiens 


expressed-Xq28STS protein 


1444 
i Mil 


OO 

77 


825 


X77371 


Mesocricetus 
auratus 


COR1 


641 


7R 


826 


AB014576 


Homo sapiens 


KIAA0676 protein 1 


296 


70 
/y 


827 


AL049733 


Homo sapiens 


dJ875H3.1 (APKI antigen) 


1584 


77 


828 


AF222980 


Homo sapiens 


disrupted in Schizophrenia 1 protein 


441-8 


100 


829 


Z31560 


Homo sapiens 


sox-2 






830 


AF295773 


Homo sapiens 


ral guanine nucleotide dissociation 
stimulator 


4717 


oo 
yy 


831 


AB041926 


Homo sapiens 


GCK family kinase MINK-2 


6866 


inn 


832 


L04948 


Saccharomyces
cerevisiae


mitochondrial transporter protein 


J J o , 


35 




a mom i o 


Mus musculus 


Fish protein 


704 


94 


834 


Z34289 


Homo sapiens 


nucleolar phosphoprotein p130


3455 


99 


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 


AF230877 


Homo sapiens 


MIP-T3 


2945 


99 


837
838 


X58288 
X56958 


Homo sapiens 
Homo sapiens 


protein-tyrosine phosphatase


7734 


99 


839 


AC024791 


Caenorhabditis 
elegans 


ankyrin (brank-2)

contains similarity to beta-lactamases 


9631 
370 


100
24 
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SEQ 
ID 
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ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


840 


D83197 


Homo sapiens 


ankyrin repeat protein 


802 


99 


841 


AF053711 


Serinus
canaria


neurofilament medium subunit 


192 


31 


842 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank
Accession Number L25899 


990 


96 


843 


U76343 


Homo sapiens 


GABA transport protein 


2992 


98 


844 


Y13645 


Homo sapiens 


uroplakin II 


897 


100 


845 


D21064 


Homo sapiens 


similar to rat general mitochondrial
matrix processing protease mRNA
(RATMPP).


2710 


99 


846


A 171 tiOCT)


Homo sapiens


Niemann-Pick C3 protein; NPC3


704/ 


100 


847


AF192522


Homo sapiens


Niemann-Pick C3 protein; NPC3


5472 


100 


848 


X60489 


Homo sapiens 


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239_1 


2277 


67 


850 


AC003682 


Homo sapiens 


R28830_1


2401 


100 


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein) 


353 


61 


852 


Z48475 


Homo sapiens 


glucokinase regulator 


3155 


99 


853 


Z83844 


Homo sapiens 


dJ37E16.2 (SH3-domain binding 
protein 1) 


1884 


98 


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36 


855 


AF062741 


Rattus 
norvegicus 


pyruvate dehydrogenase phosphatase 
isoenzyme 2 


447 


80 


856 


Y11411 


Homo sapiens 


pristanoyl-CoA oxidase 


3595 


98 


857 


M97188 


Strongylocentrotus
purpuratus


tektin A1


290 


46 












858 


AB001105 


Homo sapiens 


hippocalcin-like protein 4 


995 


100 


859 


AF164791


Homo sapiens 


putative 38.3 kDa protein 


1795 


100 


860 


AF298117


Homo sapiens 


homeobox protein OTX2 


1477 


93 


861 


AF015264 


Rattus 
norvegicus 


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kb subunit of RAB30 /74 


1284 


100 


863 


M12140 


Homo sapiens 


envelope protein 


202 


81 


864 


AF161459 


Homo sapiens 


HSPC109 


815 


98 


865 


AL109983


Homo sapiens 


dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


444 


300 


866 


M77183 


Rattus 
norvegicus 


alpha-1-macroglobulin


227 


45 


867


AJ272663


Homo sapiens 


gephyrin 


3785 


100 


868 


X75285 


Mus musculus 


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99


870


AJ297743 


Mus musculus 


torsinB protein 


169 


43


871 


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens 


ubiquitin-specific protease 3 


256 


43 


873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10).


535 


100 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4 


1136 


53 


875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain 
enzyme APOLLON 


627 


100 


876 


Y48586 


Homo sapiens 


Human breast tumour-associated 
protein 47. 


2537 


98 


877 


AF182198


Homo sapiens 


intersectin 2 long isoform 


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence.


210 


23 
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SEQ 
ID 

NO: 



881 



882 



883 



884 



ACCESSION 
NUMBER 



AL021068 



AC005498 



AF165518 



D222I1 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



DESCRIPTION 



dJ206D15.3 



R31665 2 



MAGOH isoform 



protein tyrosine phosphatase (PTP- 
BAS, type 3) 



SMITH- 
WATERMAN 
SCORE 



2615 
318 



182 



368 



% 

IDENTITY 



99 



82 



94 



43 



Homo sapiens 



886 



X52836 



nuclear respiratory factor-2 subunit 
beta1



869 



887 



X51466 



Homo sapiens 



888 



AB039903 



Homo sapiens 



Homo sapiens 



tryptophan hydroxylase (AA 1 - 444) 
elongation factor 2



2320 



889 



X51760 



AJ243396 



Homo sapiens 



interferon-responsive finger protein 1 
long form 



4460 



1096 



Homo 



zinc finger protein (583 AA) 



sapiens 



voltage-gated sodium channel beta-3 
subunit 



3130 



1024 



62 



98 



100 



98 



100 



100 



Homo sapiens 



892 



AB020598 



Homo sapiens 



Fragment of human secreted protein 
encoded by gene 4. 



391 



peptide transporter 3 



3017 



100 



100 



894 



Y66648 



Homo sapiens 



895 



896 



897 



898 



899 



900 



901 
902 



A29218_cd 
1 



Homo sapiens 



Membrane-bound protein PRO1120



Homo sapiens 



Membrane-bound protein PRO1120



4722 



AJ000332 



X98259 



Homo sapiens 



19-NOV-1998 DNA encoding G- 
protein coupled 7 TM receptor with 
AXOR15 activity.



3606 



2178 



Glucosidase II 



X57110 



Homo sapiens 



X63652 



X85134



903 



L11672



Y85565 



X54871 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



M-phase phosphoprotein 8 
c-cbl protein



inter-alpha-trypsin inhibitor heavy 
chain ITIH1 



Homo sapiens 



Homo sapiens 



RB protein binding protein 
zinc finger protein 



Human homologue of UNC-53 (Hs- 
UNC-53/2) sequence. 



ras related protein Rab5b 



5063 



1085 



4849 



3376 



2816 



2047 



369 



1094 



99 



96 



100 



99 



100 



99 



98 



99 



58 



83 



100 



905 



AL035295 



Homo sapiens 



Homo sapiens 



plakophilin 3 



hypothetical protein 



4065 



959 



100 



99 



906 



907 



AF208536 



Homo sapiens 



908 
909 



U79240 



Homo sapiens 



910 



U79240 
AJ132545 



Homo sapiens 



diaphanous 1 

nucleotide binding protein; NBP 



801 



Homo sapiens 



AJ132545 



912 



AL121733



913 



Y67579 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Homo sapiens 



serine/threonine protein kinase 



1372 



serine/threonine protein kinase 



protein kinase 



protein kinase 



hypothetical protein 



Human death inducer-obliterator 1 
(DIO-1) polypeptide.



2365 



2386 



2921 



1637 



1344 



1586 



35 



100 



98 



99 



100 



99 



99 



100 



Homo sapiens 



Human giant larvae homologue 



5317 



99 



916 



M94362 



917 
918 



AJ011654 



AJ131899 



Homo sapiens 



Homo sapiens 



Homo sapiens 



Rattus 
norvegicus 



Human giant larvae homologue 



lamin B2 



triple LIM domain protein 



proline rich synapse associated 
protein 1 



3495 



2357 



3432 



5776 



96 



93 



100 



919 
920 



U95822 
Y11588 



Homo sapiens 



Homo sapiens 
Homo sapiens 



putative transmembrane GTPase 



putative transmembrane GTPase 



1816 



1237 



100 



100 



922 



X84195 



923 



U72882 



924 1 A£00"066T 
AF126245 



925 



Homo sapiens 
Homo sapiens 



apoptosis specific protein



acylphosphatase 



1492 



Homo sapiens 



interferon-induced leucine zipper 
protein 



510 



1409 



Homo sapiens 



hADV36Sl 



acyl-Coenzyme A dehydrogenase-8 
precursor 



573 



2162 



100 



100 



99 



100 



100 
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SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 




AJbuuiyoo 



Deinococcus 
radiodurans 



hypothetical protein 


147 


27 


927 


W81576 


Homo sapiens 


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 42 SEQ ID 
NO:165. 


1401 


100 


931


Yylo44 



Homo sapiens 


Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317. 


1243 


100 


932 


D90279 


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934 


AF147790 


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340
Q39222; match: Q40368 P36412
P40393 Q40723; match: CE01798
Q38923 Q40191 Q41022; match:
Q39433 Q40177 Q40218 Q08146;
match: P10949 P11023 Q16948
Q20337; match: Q25389 P25228
P20336 P05713; match: P35276
Q08147 P17609 P22128; match:
Q15771 P36410 P35291; GTP-
binding 


726 


94 


936 


AJ3041533 


Homo sapiens 


sperm antigen 


1054 


38 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel 


3914 


100 


938 


AB032481


Homo sapiens 


homeobox transcription factor 


1744 


100 


939 


AF111106


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 


99 


940 


Y17999


Homo sapiens 


Dyrk1B protein kinase


3331 


99 


941 


AF305872 


Homo sapiens 


thyroglobulin 


455 


92 


942 


AF263462 


Homo sapiens 


cingulin 


5939 


99 


943 


AK024442 


Homo sapiens 


FLJ00032 protein 


1616 


61 


944 


Y35911 


Homo sapiens 


Extended human secreted protein 
sequence, SEQ ID NO. 160. 


262 


35 


945 


AB015320 


Homo sapiens 


sigma 1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2


229 


35 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase


6207 


99 


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens 


unnamed protein product 


1659 


100 


950 


AL021578 


Homo sapiens 


dJ453C12.6.1 (uncharacterized
hypothalamus protein (isoform 1))


257 


42 


951 


AB032435 


Homo sapiens 


differentiation-associated Na- 
dependent inorganic phosphate 
cotransporter 


3063 


99 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953


Aojjo / 


Mus musculus 


1A13 protein 


1420 


59


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like 
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55 
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SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 

WATERMAN

SCORE 


% 

IDENTITY 


957 


U68535 


Mus musculus 


aldo-keto reductase


451 


73 


958 


AC007067 


Arabidopsis 
thaliana 


T10O24.10 


1594 


57 


959 


U72194 


Mus musculus 


muskelin 


3947 


99 


960 


AE003661 


Drosophila 
melanogaster 


CG15168 gene product 


277 


54 


961 


X80332 


Mus musculus 


rab20 


983 


82 


962 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


963 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


964 


L32602 


Rattus 
norvegicus 


homeodomain 159..341 


1821 


96 


965 


Z97832 


Homo sapiens 


dJ329A5.3 (KIAA06460 protein)


3581 


99 


966 


W88995 


Homo sapiens 


Polypeptide fragment encoded by 
gene 146. 


176 


39 


967 


U12465 


Homo sapiens 


ribosomal protein L35 


604 


100 


968 


AF151803 


Homo sapiens 


CGI-45 protein 


1101 


78 


969 


W74865 


Homo sapiens 


Human secreted protein encoded by 
gene 137 clone HMWIF35. 


1348 


98 


970 


L21936 


Homo sapiens 


succinate dehydrogenase flavoprotein 
subunit 


703 




971 


AJ133521 


Drosophila 
buzzatii 


protease, reverse transcriptase, 
ribonuclease H, integrase 


194 


23 


972


AC006017 


Homo sapiens 


N-acetylgalactosaminyltransferase;
similar to Q10473 (PID:g1709559)


3271 


100 


973 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


974 


M17885 


Homo sapiens 


acidic ribosomal phosphoprotein (P0) 


792 


100 


975 


U22829 


Mus musculus 


P2Y purinoceptor 


399 


40 


976 


AL132772


Homo sapiens 


dJ1013A22.1 (hepatic nuclear factor 
4, alpha) 


2466 


99 


977 


AC003973 


Homo sapiens 


ZNF91L 


1550 


43 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9; 
EC 6.3.4.3) 


2824 


63 


979 


AF136715 


Homo sapiens 


taxol resistant associated protein 


217 


76 


980 


AF136715 


Homo sapiens 


taxol resistant associated protein 


306 


95 


981 


Z92822 


Caenorhabditis 
elegans 


ZK520.1 


1109 


44 


982 


AJ295149 


Homo sapiens 


putative dipeptidase 


1564 


99 


983 


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE) 


1492 


100 


984 


AL161501 


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLE 3 



SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00282


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 4.259e-14 97-120 


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74- 
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230
BL00298B 15.64 1.290e-39 134- 
181 BL00298G 24.57 5.345e-39 
465-520 BL00298I 30.07 7.818e-
34 661-715 BL00298D 17.97 
o.Z2t>e-33 242-Z52 


4 


PR00237 


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237A 11.48 4.316e-13 57-82 


5 


PD02454 


! ! ! ! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75- 
103 


6 


DM00864 


EGF-LIKE DOMAIN.


DM00864A 15.21 7.429e-09 98-
119 


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138- 
160 PR00237B 13.50 8.250e-09 
61-83 


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289


10 


BL00139 


Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 


BL00139D 9.24 4.400e-11 391-
408 BL00139A 10.29 7.511e-09
67-77 


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689- 
725 BL01113C 13.18 4.857e-11
757-777 BL01113D 7.47 2.161e-
10 790-800 


13 


BL01113 


C1q domain proteins.


BL01113B 18.26 3.813e-14 599- 
635 BL01113C 13.18 4.857e-11
667-687 BL01113D 7.47 2.161e- 
10 700-710 


14 


BL00594 


Aromatic amino acids permeases 
proteins. 


BL00594A 16.75 6.531e-10 50-94 


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707- 
728 


16 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 7.462e-18 310-
330 PR00625B 13.48 3.939e-15 
340-361 


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175- 
195 PR00741F 14.66 9.262e-21 
243-265 PR00741B 14.23 1.947e- 
18 128-145 PR00741G 9.29
2.180e-17 318-340 PR00741C
9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-
374 PR00741A 9.24 3.596e-13
89-105 PR00741E 13.39 3.535e-
12 215-232


22 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 3.647e-20 117- 
148 BL00107B 13.31 1.000e-16 
182-198 


23 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126- 
157 


24 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.600e-23 126-
157


27 


BL00239 


Receptor tyrosine kinase class II proteins. 


BL00239B 25.15 2.324e-16 91- 
139 


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain
proteins.


BL00018 7.41 3.250e-10 681-694
BL00018 7.41 6.400e-10 717-730


30 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.308e-09 54-81


33 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 401-
416 


34 


PD01168 


SYNTHETASE LIGASE PROTEIN 
ALANYL. 


PD01168L 9.47 1.667e-09 411-
426 


36 


PR00426 


C5A-ANAPHYLATOXIN RECEPTOR 
SIGNATURE 


PR00426D 10.59 3.618e-12 110- 
122 


37 


PF00791 


Domain present in ZO-1 and Unc5-like
netrin receptors. 


PF00791B 28.49 2.049e-10 1080- 
1135 


38 


BL00350 


MADS-box domain proteins. 


BL00350 20.79 1.000e-40 1-55 


40 


BL00123 


Alkaline phosphatase proteins. 


BL00123B 19.31 1.000e-40 90- 
133 BL00123C 24.61 1.000e-40 
145-195 BL00123E 22.25 1.000e-
40 PIT 00193H 9£ 01 

1.000e-40 438-488 BL00123F 
19.03 8.714e-35 364-399 
BL00123A 10.80 9.000e-24 52-77 
BL00123D 12.73 1.000e-17 216-
229 


44 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 2.800e-14 346-359 
PD00066 13.92 4.600e-14 486-499 
PD00066 13.92 1.000e-13 374-387 
T>nri(\(\(\(\ 1 3 Q9 (\ onn© i ^ /i^c Am 

PD00066 13.92 2.714e-12 234-247 

PDOOnA£ 1 3 09 1 1i43o 17 Aid A AO. 

PD00066 13.92 8.714e-12 514-527 
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 9.038e-10 318-331


45 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 2.946e-10 180-
217 


47 


BL00649 


G-protein coupled receptors family 2 
proteins. 


BL00649C 17.82 1.682e-10 475- 
501 BL00649B 20.68 7.387e-09 
417-463 


50 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318 
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430

Pr}ftftfV»£ 13 09 9 BOO© 1/1 7/1 Q 7/C7 

PD00066 13.92 2.800e-14 277-290 
PD00066 13.92 8.800e-14 333-346 
PD00066 13.92 9.400e-14 361-374 
PD00066 13.92 4.000e-13 389-402 
PD00066 13 99 6 S71<*-19 473 -4 


51 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 417-
464 BL00226B 23 86 3 34Re-3^ 
251-299 BL00226C 13.23 1.429e- 
24 316-347 BL00226A 12.77 
1.857e-15 151-166 


52 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 5.648e-09 133- 
149


53 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins.


BL00232B 32.79 1.000e-40 143- 

101 T3T 00939 A 97 77 7 3 SO* 7B 

49-82 BL00232B 32.79 7.052e-21 
252-300 BL00232C 10.65 6.625e- 
20 250-268 BL00232B 32.79 
1.314e-11 367-415 BL00232C
10.65 9.308e-10 470-488 


54 


BL00303 


S-100/ICaBP type calcium binding
protein.


BL00303B 26.15 8.759e-23 125-
162 BL00303A 21.77 1.000e-21
82-119


58 


PR00378 


INOSITOL PHOSPHATASE
SIGNATURE


PR00378D 16.86 1.000e-15 242-

2ol PRUU37ob Ij.oU 9.250e-l3 
i no no 

iuy-i2y 


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120-
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338 


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 5.091e-13 188- 
212 PR00237G 19.63 7.207e-13
268-295 PR00237A 11.48 4.375e- 
11 24-49 PR00237C 15.69 
3.057e-10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e-10 230- 
255 PR00237B 13.50 9.438e-10 
57-79 


70 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.938e-28 31-70 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE 
PROTEASE (S16) SIGNATURE


PR00830A 8.41 8.759e-12 348-
368 


72 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148- 
163 


77 


PR00753 


1-AMINOCYCLOPROPANE-1-
CARBOXYLASE SYNTHASE 
SIGNATURE 


PR00753E 8.01 3.552e-11 191-
216 PR00753D 6.85 2.778e-09 
131-153 


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00506C 19.40 8.017e-09 96- 
119 


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436- 
467 


84 


BL00675 


Sigma-54 interaction domain proteins 
ATP-binding region A proteins. 


BL00675A 24.86 8.800e-10 256- 
300 


85 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26 
328-364 


91 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.250e-17 10-35 
BL00215A 15.82 6.000e-16 221- 
246 BL00215A 15.82 7.857e-12 
108-133 BL00215B 10.44 9.526e- 
11 168-181


92 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.526e-24 324-367 


95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119- 
136 


96


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327B 19.84 2.091e-09 143- 
165 


97 


BL00752 


XPA protein. 


BL00752B 19.17 7.309e-09 28-72


98 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e-10 135- 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 9.824e-12 122- 

141 


100 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.429e-31 118-161


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387 
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
BL00028 16.07 6.100e-10 258-275


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665- 
679 PR00048A 10.52 8.500e-14 
581-595 PR00048A 10.52 9.250e- 
14 637-651 PR00048A 10.52 
2.059e-12 609-623 PR00048A 
10.52 2.588e-12 469-483 
PR00048A 10.52 7.353e-12 553- 
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-
523 PR00048A 10.52 5.696e-10
497-511 PR00048B 6.02 8.875e- 
10 429-439 PR00048B 6.02 
1.000e-09 457-467 PR00048B 
6.02 6.684e-09 485-495 


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 31.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211 


104 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.865e-09 121- 
148 BL01113A 17.99 5.846e-09
82-109 


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99
BL00420A 20.42 8.525e-10 73- 
102 BL00420A 20.42 5.708e-09 
85-114 


108


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 27-41 
PR00860A 5.46 5.500e-16 5-18 
PR00860C 9.61 1.474e-14 41-51


112 


BL01031 


Heat shock hsp20 proteins family profile. 


BL01031C 17.68 6.400e-10 122- 
147 


114 


DM01840 


kw SPAC24B11.09 R07E5.13.


DM01840B 22.04 2.688e-40 59-
103 DM01840A 10.95 9.571e-13 
31-43 


115 


BL01126 


Elongation factor Ts proteins. 


BL01126A 18.48 2.317e-30 46-89 
BL01126B 13.15 7.387e-19 116- 
135 BL01126C 9.20 9.735e-11
190-203 


116 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.375e-21 35-85 


118 


BL00437 


Catalase proximal heme-ligand proteins. 


BL00437A 18.82 1.000e-40 49- 
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-
40 190-239 BL00437D 25.72
1.000e-40 248-301 BL00437E
23.95 1.000e-40 327-379 


119 


BL00140 


Ubiquitin carboxyl-terminal hydrolase 
family 1 cysteine activ. 


BL00140D 22.64 8.274e-14 164-
208 BL00140C 11.80 5.444e-10 
77-102 


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95- 
148 


122 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 1.000e-40 16-62


123 


PR00041 


CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE


PR00041D 7.95 2.906e-09 24-41




124 


PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN
SIGNATURE


PR00041D 7.95 2.906e-09 24-41


125 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212-
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127 


PR00318 


ALPHA G-PROTEIN (TRANSDUCIN) 
SIGNATURE 


PR00318D 16.28 1.900e-34 219-
248 PR00318B 14.79 3.455e-27
168-191 PR00318C 12.09 7.000e-
23 197-215 PR00318A 7.84
1.600e-19 35-51 PR00318E 7.23
2.500e-12 265-275 


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824B 9.21 7.750e-22 133- 
153 


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824C 14.58 1.000e-40 166- 
204 BL00824D 14.04 1.621e-38 
204-239 BL00824B 9.21 7.750e- 
22 133-153 BL00824E 12.49 
1.000e-19 247-263 


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1209-
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187 


134 


PR00708 


ALPHA-1-ACID GLYCOPROTEIN
SIGNATURE 


PR00708D 14.67 1.000e-27 141- 
168 PR00708C 11.77 1.643e-25 
98-120 PR00708B 15.15 2.174e-
24 73-95 PR00708E 13.33 
1.600e-21 189-207 PR00708A 
14.40 2.636e-21 51-70 


135 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.468e-13 126- 
145 


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 
217 


137 


BL00471 


Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat. 


BL00471 23.92 7.480e-10 42-90


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.582e-10 328-
346 PR00205B 11.39 9.018e-10 
543-561 


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976- 
1027 


143 


PR00979 


TAFAZZIN SIGNATURE 


PR00979E 10.83 5.950e-26 192- 
214 PR00979A 11.91 8.773e-25 
63-83 PR00979C 12.16 6.400e-19 
108-124 PR00979D 12.38 7.955e- 
19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K. 


DM00686C 14.14 7.720e-09 111- 
131 


146 


PR00604 


CLASS IA AND IB CYTOCHROME C 
SIGNATURE 


PR00604D 15.86 1.000e-17 87- 
104 PR00604B 12.73 9.591e-16 
57-73 PR00604C 10.21 8.200e-12 
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e- 
11 44-52 PR00604F 8.60 1.000e-

10 193-177 


147 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 3.864e-15 266- 
297 BL00107B 13.31 6.143e-11
335-351 


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81


149 


PR00069 


ALDO-KETO REDUCTASE
SIGNATURE 


PR00069D 19.36 1.857e-30 187-
zi / jPKUUUoyA 16.01 7.429e-25 
41-66 PR00069E 18.14 3.100e-22
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 

R 071 1 O 1 0l T")fk 


150 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.688e-27 139-182 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE
PSEUDOURIDINE LYASE TR


PD02906C 24.17 7.070e-22 165- 
zuu I^UUzyOoB 15.35 8.393e-15 

11/1 1 97 DFWx7QA£ A 1 A O A £. cc\r\ 

rJJU29UoA 10.84 6.500e- 

09 71-84 


153 


BL00479 


Phorbol esters / diacylglycerol binding
domain proteins.


r>l>UU4/9A lV.oo 5.091e-12 891- 
914 BL00479B 12.57 1.837e-11

Q1S-Q31 


158 


BL00027 


'Homeobox' domain proteins.


BL00027 26.43 6.786e-31 143-186 


160 


BL00422


Granins proteins.


di_.IKJ4zzl, lo.lo 7.750e-12 420- 
448 


162 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 9.297e-11 62-82


164 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 6.182e-10 347- 


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74 
PR00860C 9.61 1.900e-15 97-107 


167


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196- 
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316- 
353 BL00514G 15.98 2.241e-34 
471-501 BL00514H 14.95 6.571e-
27 510-535 BL00514E 14.28
1.273e-16 388-405 BL00514D
15.35 9.100e-15 369-382 
BL00514B 16.42 4.857e-14 260- 
276 BL00514F 11.65 9.690e-14 
416-431 BL00514A 11.68 8.200e- 
11 149-159 


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34 
423-453 BL00514H 14.95 6.571e- 
27 462-487 BL00514E 14.28
1.273e-16 340-357 BL00514D 
15.35 9.100e-15 321-334 
BL00514B 16.42 4.857e-14 212- 
228 BL00514F 11.65 9.690e-14 
368-383 BL00514A 11.68 8.200e- 
11 101-111 


171 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514G 15.98 2.241e-34 385-
415 BL00514H 14.95 6.571e-27 
424-449 BL00514C 17.41 4.632e- 
24 230-267 BL00514E 14.28 
1.273e-16 302-319 BL00514D
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14
330-345 BL00514A 11.68 8.200e-
11 101-111 


173


BL00027


'Homeobox' domain proteins.


BL00027 26.43 9.400e-29 119-162


174 


DM01970 


0kwZK632.12YDR313C 

TmT\r\^r\\A ai ttt 
JbiNL/UoUMAJL tu. 


DM01970B 8.60 5.119e-15 1391-
1404 


176 


BL00773 


Chitinases family 19 proteins. 


BL00773C 9.42 8.000e-09 2-16


1 CO 


PR00109


TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE


PR00109B 12.27 9.163e-14 141-
160 


1 

loJ 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-. 


PD01937A 6.68 3.475e-09 221- 
232 


183 


BL00845


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 


100 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-
541 


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
513 


188 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194C 6.38 1.900e-35 145- 
174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-199 PR00194B 10.24 
5.200e-24 120-141 PR00194A 
7.86 4.857e-21 84-102 


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131- 
146 PD02042A 21.13 5.909e-09 
94-121 


193 


PR00021


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins. 


BL00463 8.22 5.071e-09 111-123 


196 


PR00118


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR00118F 16.42 9.386e-09 165- 
181 


197 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 5.424e-09 234- 
267 


198 


BL00660 


Band 4.1 family domain proteins.


BL00660A 31.50 5.500e-11 714-
767 


199 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.820e-13 70-93 


202


PR00009


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971- 
987 PR00009C 14.11 8.773e-13 
996-1008 PR00009D 16.83 
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904


203 


BL00025 


P-type 'Trefoil' domain proteins.


BL00025 17.17 4.536e-19 38-59 


205


BL00018


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 7.300e-10 165-178 


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207


BL00025

P-type 'Trefoil' domain proteins. 


BL00025 17.17 3.423e-20 39-60
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110- 
143 BL00646A 25.82 6.192e-29 
14-62 


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279- 
SEQ
ID 

NO: 



212 



213 



214 



215 



216 



224 



228 



230 



232 



233 



236 



238 



ACCESSION 
NO. 



PCTYUS01/04098 



DM01206 



PD01941 



DESCRIPTION 



RESULTS* 



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



BL00362 



TRANSMEMBRANE 

COTRANSPORTER SYMP. 



.505 PR00138C 16.41 3.000e-24 
218-247 PR00138E 6.01 8.714e-
13 314-328 PR00138A 15.14
9.538e-13 134-148 PR00138B 
15.82 4.522e-12 188-204 



DM01206B 10.69 8.429e-12 386-
406 DM01206B 10.69 1.247e-10
384-404 DM01206B 10.69
5.068e-10 388-408



BL00115 



BL00038 



BL01108 



BL00018 



PF01329 



BL00211 



PR00761 



Ribosomal protein S15 proteins. 



Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 



Myc-type, 'helix-loop-helix' dimerization
domain proteins. 



PD01941A 14.81 1.000e-40 163-
217 PD01941B 15.02 9.705e-30
420-467 PD01941E 15.92 8.714e-
23 837-884 PD01941C 19.96 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710 
PD01941F 28.52 9.645e-15 1005- 
1060 



BL00362 24.67 8.313e-09 330-373



Ribosomal protein L24 proteins. 



BL00115 23.12 2.125e-09 1178-
1227 BL00115 23.12 6.096e-09
1164-1213 



BL00038B 16.97 7.600e-18 125-
146 BL00038A 13.61 1.474e-13
102-118 



BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96- 
107 




Actin-depolymerizing proteins, 



EF-hand calcium-binding domain 
proteins. 



Pterin 4 alpha carbinolamine dehydratase.



ABC transporters family proteins 



XINDIN PRECURSOR SIGNATURE 



PR00049 



BL00412 



WILM'S TUMOUR PROTEIN
SIGNATURE 



BL01210 



Neuromodulin (GAP-43) proteins.



BL00325B 21.66 1.000e-40 93-
139 BL00325A 24.83 9.333e-24 
61-93 



BL00018 7.41 1.450e-10 231-244



PF01329B 18.52 1.69 2e-18 67-W 



BL00211B 13.37 6.250e-18 1033- 
1065 BL00211B 13.37 8.875e-18 
2045-2077 BL00211A 12.23 
1.900e-09 931-943 



PR00761A 5.81 9.366e-09 275- 
292 



PR00049D 0.00 3.500e-10 54-69



Caveolins proteins. 



Ribosomal protein L1e proteins.



BL01252 



Endogenous opioids neuropeptides 
precursors proteins. 



BL00412D 16.54 1.978e-10 109- 
160 BL00412D 16.54 4.122e-09
133-184 



BL01210B 13.92 8.129e-09 106- 
156 



BL00939F 17.27 5.393e-09 861-
891 



BL01252D 18.25 3.571e-28 205-
233 BL01252B 19.09 5.034e-27 
37-67 BL01252C 18.10 1.621e-21 
164-190 BL01252A 14.22 7.107e- 

1 O 1 AJXA 


239 


BL00302 


Eukaryotic initiation factor 5A hypusine
proteins. 


BL00302 14.81 1.000e-40 25-79


240 


PR00420 


AROMATIC-RING HYDROXYLASE
(FLAVOPROTEIN
MONOOXYGENASE) SIGNATURE


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.527e-25 11-50 


244


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115- 
144 BL01270B 18.74 6.857e-17
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 
9.160e-13 144-182 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253- 
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49
3.890e-09 112-167


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 2.500e-13 277-290 
PD00066 13.92 9.143e-12 193-206 
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234


247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465- 
520 BL00406B 5.47 4.857e-14
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75
5.449e-11 313-368


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-
161 BL00951A 15.10 7.750e-39 
21-57 BL00951D 13.94 6.000e-38 
161-196 BL00951B 14.23 3.100e- 
31 57-88 


252 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200- 
227 BL01113A 17.99 4.818e-14 
194-221 BL01113A 17.99 7.818e- 
14 182-209 BL01113A 17.99 
1.730e-13 185-212 BL01113A
17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11
179-206 BL01113A 17.99 2.532e-
10 176-203 BL01113A 17.99
9.043e-10 218-245 BL01113A
17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137-
164 


OCT 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 1.837e-21 466-491 


259 


PR00248 


METABOTROPIC GLUTAMATE
GPCR SIGNATURE


PR00248G 12.67 2.688e-09 53-78 


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369 


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426 
BL00678 9.67 5.800e-10 455-466 



SEQ
ID 

NO: 



262 



263 



264 



265 



278 



282 



283 



286 



287 



289 



293 



295 



296 



ACCESSION 
NO. 



BL50002 



BL00049 



PD01469 



DESCRIPTION 



Trp-Asp (WD) repeat proteins proteins.



RESULTS* 



BL00678 9.67 8.800e-10 332-343



Src homology 3 (SH3) domain proteins 
profile. 



Ribosomal protein L14 proteins. 



BL00678 9.67 3.400e-10 468-479
BL00678 9.67 5.800e-10 508-519 
BL00678 9.67 8.800e-10 385-396 
BL500G2B 15.182.200e^0415^ 



429 



GLYCOPROTEIN PROTEIN
PRECURSOR SA.



BL00049C 17.38 3.040e-12 94-
130 

PD01469 20.59 2.091e-14 438-470




PD02712 



BL00678 



DM00892 



BL00048 



PR00081 



PR00310 



PD01066



BL00979 



Ubiquitin carboxyl-terminal hydrolase 
family 1 cysteine activ. 



ELEMENT TRANSPOSASE FOR 
TRANSPOSON TRANSPOSABLE. 



Trp-Asp (WD) repeat proteins proteins.
3 RETROVIRAL PROTEINASE.



BL00140D 22.64 1.000e-40 161-
205 BL00140C 11.80 9.053e-30 
79-104 BL00140A 15.96 9.400e- 
28 5-35 BL00140B 12.29 4.649e- 
17 37-55 



PD02712A 23.03 8.013e-09 47-83 



BL00678 9.67 1.474e-0 9 lOOTiT 



Protamine P1 proteins.
GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 

ANTI-PROLIFERATIVE PROTEIN



BTG1 FAMILY SIGNATURE 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



G-protein coupled receptors family 3
proteins. 



DM00892C 23.55 4.767e-21 864- 
898 



BL00048 6.39 9.550e-09 56-83 



PR00081A 10.53 1.878e-11 36-54



PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119



PD01066 19.43 7.000e-36 37-76



BL01064 



BL00030 



PROTEIN TRANSCRIPTION

REGULATION NUCLEAR 
Pyridoxamine 5'-phosphate oxidase
proteins. 



Eukaryotic RNA-binding region RNP-
proteins. 



BL00979L 20.63 3.800e-12 111- 
152 



FDU241 1 21.89 7.000e-16 195-229 



BL01064A 27.84 8.313e-28 77- 
129 BL01064C 15.22 7.136e-25 
202-235 



BL00030A 14.39 2.929e-13 37-56 
BL00030B 7.03 1.900e-ll 167- 

177 BL00030A 14.39 2.000e-10 
128-147 





298 


BL01183 


ubiE/COQ5 methyltransferase family 
proteins. 


BL01183B 21.31 6.660e-12 143-
188






Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa. 


BL01279A 24.27 5.862e-11 57-
105 


301


BL00191


Cytochrome b5 family, heme-binding 
domain proteins. 


BL00191K 17.38 4.951e-27 184-
228 BL00191JI1.37 6.447e-17 
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 


307 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 4.818e-21 59-81 
PR00245C 7.84 5.154e-20 238- 
254 PR00245D 10.47 4.000e-15 
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 
5.714e-12 291-306 


309 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e-
12 312-329 


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110-
136 BL00380G 11.26 5.800e-16 
267-280 BL00380B 14.77 7.000e- 
14 49-62 BL00380F 9.76 5.886e- 
13 203-214 BL00380C 15.67 
7.387e-13 82-98 BL00380E 12.44 
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20 


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins. 


BL00227B 19.29 1.000e-40 50-
105 BL00227C 25.48 1.000e-40
111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16 
1.000e-40 372-426 BL00227A 
24.55 3.250e-39 1-35 BL00227E 
24.15 8.500e-34 324-359 


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17 
435-483 BL00232B 32.79 6.301e-
15 116-164 BL00232B 32.79
6.769e-13 330-378 BL00232C
10.65 9.341e-12 223-241
BL00232C 10.65 5.696e-11 328-
346 BL00232C 10.65 3.942e-10 
433-451 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2-
15 


330 


PR00391 


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN SIGNATURE 


PR00391E 12.50 7.785e-15 211-
231 PR00391B 8.39 1.000e-13
83-104 PR00391D 12.21 9.328e-
13 191-207 PR00391A 7.83
Djyue- li 16-36 


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125


337 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711


SYNTHASE 


PD02711B 14.26 1.973e-20 944- 
351



366 



369 



370 



371 



373 



379 



BL00018 



^-CYCLIC NUCLEOTIDE CLASSlT 
P HOSPHODIESTERASE SIGNA TTTpf 



PD01498 



EF-hand calcium-binding domain
proteins. 



irp-Asp (WD) repeatn mtpinc 

CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



rxuittsSA 10.45 2.778e-09 86- 
105 

BL00018 7.41 3.118e-11 160-173



BL00018 7.41 2.350e-1O JAA.^i 



BL00678 9.67 1.947e-09 256-267



OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP.



OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP 



Aminoacyl-transfer RNA synthetases
class-I proteins. 



BL00523 Sulfatases proteins.



— — _ ~ ' ■ ' w v ^ **+>\J-^\} , 

DM01206B 10.69 3.278e-09 175-
195 DM01206B 10.69 6.696e-09 
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177- 
197 



PD01498C 24.90 6.880e-14 219-
263 



PD01498C 24.90 6.880e-14 219-
263 



BL00107 



BL00107 



BL00279 



Protein kinases ATP-binding region
proteins. 



Acyl-CoA-binding protein. 



Protein kinases ATP-binding region
proteins.



BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 

BL00523E 19.27 l.OOOe^TJTF^ 
348 BL00523A 13.36 5.500e-16 
30-47 BL00523B 8.64 1.964e-13
78-90 BL00523C 12.64 9.625e-13
129-140 BL00523G 9.46 5.500e-
10 506-516 

BL00107A 18.39 4.818e-09 21-52



BL00880 17.52 1.000 ^ 0 7^-19^ 



GLUTELIN SIGNATURE



Membrane attack complex components /
perforin proteins. 



— — T» I U~ IjZ.J 

BL00107A 18.39 1.000e-23 276-
307 BL00107B 13.31 1.692e-12 
342-358 



PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354



PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.
PROTEIN ZINC FINGER ZINC-

VTKJn'CTi H rr-^ j v 



BL00279E 37.11 9.349e-10 749-
797 



PD01066 19.43 1.231e-33 10-49 



PD01066 19.43 7.563e-28 10-49
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SEQ 
ID 

NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


380 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864-
878 


383 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864-
878 




BL01060


Flagella transport protein fliP family
proteins. 


BL01060A 15.65 1.535e-09 131-
174 


388 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 6.318e-11 1009-
1028 


389


PR00837


ALLERGEN V5/TPX-1 FAMILY
SIGNATURE 


PR00837B 11.64 1.000e-10 469- 
483 


391 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 7.907e-10 118-
142 


392 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE


PR00014D 12.04 8.412e-10 691- 
706 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 


395 


BL00634 


Ribosomal protein L30 proteins. 


BL00634 34.38 4.090e-13 70-121 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358-
402 BL01013A 25.14 7.231e-21
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121


397 


BL00930 


Peripherin/rom-1 proteins. 


BL00930E 17.80 1.000e-40 56-92
BL00930D 9.12 4.632e-37 12-56 
BL00930F 16.91 2.800e-36 92- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE 


PR00780B 4.89 4.491e-09 262- 
285 


401 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e-11 4-20


403 


BL00381 


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150- 
194 BL00381A 16.48 2.286e-22 
74-111 BL00381B 21.42 8.326e- 
14 78-130 


405 


BL01105


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68- 
108 


406 


BL00344 


GATA-type zinc finger domain proteins. 


BL00344 17.99 7.000e-12 814-852 


407 


PR00211


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 4.321e-09 9-22


410 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 1.000e-28 752- 
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e-
18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262-
280 BL00690A 6.87 1.818e-13
230-240 


415


BL00227


Tubulin subunits alpha, beta, and gamma
proteins. 


BL00227B 19.29 1.000e-40 52-
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361
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432 



433 



434 



437 



440 
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PD00930 



Kazal serine protease inhibitors family 
proteins. 

PROTEIN GTPASE DOMAIN
ACTIVATION.



PD01066 



PR00449 



PR00120 



BL00115 



PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.
TRANSFORMING PROTEIN P21 RAS



SIGNATURE 

H+-TRANSPORTING ATPASE



BL00282 16.88 8.875e-12 464-487



PD00930B 33.72 7.800e-18 316-
357 PD00930A 25.62 9.617e-12
125-151 PD00930B 33.72 2.521e-
10 214-255



PD01066 19.43 4.649e-34 34-73 



PR00449A 13.20 7.563e-11 56-78



(PROTON PUMP) SIGNATURE

KlllrarvrtfiV 1>XT A 1 



Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 



PR00120C 9.90 5.800e-19 705-
722 



PF00628 



PD01066 
PR00309 



PHD-finger.

PROTEIN ZINC FINGER ZINC-



FINGER METAL-BINDING NU.
ARRESTIN SIGNATURE



BL00115T 8.46 7.273e-29 1208-
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 8.000e-
17 1604-1650 BL00115M 19.19
8.130e-16 731-774 BL00115H
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
13.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.538e-
10 913-953 BL00115S 18.24
7.968e-10 1010-1052 BL00115U
10.34 4.475e-09 1242-1265



PF00628 15.84 4.536e-10 219-234



PD01066 19.43 6.351e-34 10-49




PR00309A 9.68 5.250e-24 32-55
PR00309D 7.09 4.938e-23 290-
309 PR00309B 7.81 2.800e-21
69-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e-
15 374-389
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SEQ 
ID 

NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






phosphate attachment si. 


129 BL00600G 12.43 2.125e-12 
5\jo-5Zd oLUUoUOr o./7 8.105e- 
12 271-284 BL00600E 16.43 
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221 


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-
195 BL00349F 11.81 1.000e-40
213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G
19. 11 j./ole-3U jzi-Joo 


445 


BL00154 


E1-E2 ATPases phosphorylation site 
proteins.


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115
DM00215 19.43 6.492e-09 87-120 


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39 
253-286 BL01283B 23.17 6.538e- 
38 170-212 BL01283C 13.05 
7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162 


RIESKE 2FE-2S SUBUNIT 
SIGNATURE 


PR00162B 12.77 7.429e-17 215- 
228 PR00162A 9.35 2.324e-14
193-205 PR00162C 8.10 7.120e- 
14 227-240 


454


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-30 87-126


456 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.333e-18 1149-
1192


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290A 20.89 1.529e-14 154- 
177 BL00290B 13.17 9.000e-12
214-232 


460 


PR00413 


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413F 14.91 7.333e-11 193-
214 PR00413E 15.78 5.714e-09
175-192




PR00759


BASIC PROTEASE (KUNITZ-TYPE)
INHIBITOR FAMILY SIGNATURE


PR00759B 11.26 8.385e-09 74-85


466 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


467 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL
CIS-TRANS ISOMERASE
SIGNATURE


PR00153D 11.99 3.250e-15 510-
523 PR00153C 11.01 4.682e-14 

14 523-539 PR00153B 11.57 
1.720e-13 452-465


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557- 
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 


PD00289 9.97 1.000e-14 1482- 
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495 



ACCESSION 
NO. 



PCT/US01/04098 



RESULTS* 



Elongation factor 1 gamma chain profile.




1496 PD00289 9.97 8.650e-11
1122-1136 



BL50040D 17.41 1.000e-40 279-
329 BL50040E 18.79 1.000e-40
333-388 BL50040F 18.99 5.320e-
40 390-428 BL50040C 22.62
3.739e-38 141-184 BL50040B
13.65 7.000e-30 59-85 BL50040A
12.98 1.450e-14 10-22



^.-r-ic-xi Joy- 

611 PR00007B 14.16 3.500e-21 
544-564 PR00007A 19.33 6.897e-
20 517-544 PR00007D 9.64
6.571e-12 623-634

BL50002A 14.19 5.846e-10 170-



DM01970B 8.60 9.500e-17 967-
980 



PR00868C 13.76 5.688e-17 284-
308 PR00868A 16.33 3.186e-13
224-247 PR00868H 12.51 3.388e-
13 431-448 PR00868I 10.87
7.938e-11 462-476 PR00868E
13.19 1.608e-10 340-366



PF00023A 16.03 9.625e-10 760-
776 PF00023A 16.03 3.571e-09 
715-731 



PD02870B 18.83 9.262e-20 103-
136 PD02870D 15.74 9.426e-09 
201-236 



BL00211 



BL00027 



BL00383 




— v./ouc-u JVV-DDZ 

BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 3.625e-10 779-822 



HUJU107A 18.39X800^222141 
245 BL00107B 13.31 1.000e-13 
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31
8.615e-12 652-668 
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SEQ 
ID

NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






proteins. 


1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34
5.500e-14 1730-1745 BL00383C
10.10 2.000e-13 1785-1796
BL00383F 15.51 9.069e-12 1940-
1956 BL00383B 7.61 1.692e-11
1755-1764


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e- 
09 160-174 


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367- 
414 BL00226B 23.86 6.143e-27 
195-243 BL00226A 12.77 7.840e- 
14 96-111 BL00226C 13.23 
2.600e-13 309-340 BL00226C 
13.23 6.143e-12 266-297 
BL00226B 23.86 1.209e-09 146- 
194 


505 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407F 7.61 6.739e-09 916- 
930 


506 


PF00632 


HECT-domain (ubiquitin-transferase). 


PF00632C 20.66 9.830e-19 991- 
1023 PF00632B 18.45 1.155e-11
940-968 


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10 
763-778 PR00320C 13.01 6.760e- 
10 567-582 PR00320A 16.74 
7.618e-10 846-861 PR00320A 
16.74 3.415e-09 763-778 
PR00320A 16.74 6.268e-09 567- 
582 


511 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e-12 170- 
183 


512 


BL50058 


G-protein gamma subunit profile.


BL50058 27.23 7.494e-09 10-58 


513 


BL00524 


Somatomedin B domain proteins. 


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041 


Bacterial regulatory proteins, araC family 
proteins. 


BL00041 23.99 1.964e-19 492-524


516 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.500e-13 391-404 


517 


BL00415 


Synapsins proteins. 


BL00415E 4.82 9.291e-09 959- 
996 


518 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.471e-12 126- 
145 


519 


BL00290 


Immunoglobulins and major 
histocompatibility complex proteins. 


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505 


D12 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 


PR00505A 14.15 7.128e-09 364- 
381 


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891- 
920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131-
150 PR00254A 11.23 4.706e-14 
61-78 PR00254C 11.36 4.000e-12
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ID 
NO: 



531 



532 



546 



547 



554 



555 



558 



ACCESSION 
NO. 



PCT/US01/04098



DESCRIPTION



BL00741 



PR00193 



Guanine-nucleotide dissociation
stimulators CDC24 family sign.
MYOSIN HEAVY CHAIN 



SIGNATURE 



RESULTS* 



113-126 PR00254B 12.97 1.486e-
11 95-110



BL00741B 14.27 6.870e-16 787-
810 



rtwviMD 14363A43e^U47^ 
476 PR00193C 12.60 7.632e-32 
216-244 PR00193B 11.69 7.750e-
29 167-193 PR00193A 15.41
2.588e-22 111-131 PR00193E 
19.47 2.200e-21 501-530 




PF00642 



BL00383 



BL01226 



Zinc finger C-x8-C-x5-C-x3-H type (and
similar). 

Tyrosine specific protein phosphatases



proteins. 



PF00023A 16.03 7.857e-11 138-
154 



PF00642 11.59 9.082e-10 838-849



Hydroxymethylglutaryl-coenzyme A
synthase proteins. 



BL00383E 10.35 4.115e-10 104-
115 



BL00383



PR00403 



Tyrosine specific protein phosphatases
proteins. 

WW DOMAIN SIGNATURE 



BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-
167 BL01226D 11.60 1.000e-40
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74
1.000e-40 386-434 BL01226I
25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-
321 BL01226B 13.35 1.818e-31
95-127 BL01226F 9.78 8.714e-23
253-271



DM01930E 15.41 1.367e-37 170-
215 DM01930F 14.16 8.232e-28
267-303 DM01930B 19.86
9.163e-10 37-71



BL00195B 15.31 7.158e-09 9~^9~



PR00380 KINESIN HEAVY CHAIN
SIGNATURE




BL00383E 10.35 2.756e-12 436-
447 



PR00403B 12.19 7.612e-11 122-
137 PR00403A 16.82 3.912e-10 
107-121 PR00403B 12.19 2.068e- 
09 76-91 
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SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








297 PR00380C 13.18 5.154e-20 
226-245 PR00380B 12.64 9.400e- 
20 195-213 


559


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 5.333e-09 522-531


561 


PD01795 


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA. 


PD01795B 11.56 2.333e-12 159- 
172 PD01795A 10.27 1.000e-09 
135-144 


562 


PD01795 


PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA.


PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09 
86-95 


563 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 1.391e-09 41-54 


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301B 5.49 4.115e-09 284-
295 


569 


PF00850


Histone deacetylase family. 


PF00850E 8.88 6.553e-21 756-782
PF00850D 14.76 1.519e-16 722-
746 PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e-
11 833-875


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.800e-11 44-53


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986- 
1021 


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200- 
242 BL00284A 15.64 4.913e-18 
71-95 BL00284B 17.99 7.261e-15
173-194 BL00284D 16.34 5.846e- 
13 306-333 BL00284E 19.15 
7.429e-12 387-412 


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.553e-29 15-54 


580 


BL50001 


Src homology 2 (SH2) domain proteins 
profile. 


BL50001B 17.40 4.500e-12 1010- 
1031 


581 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.806e-17 
505-531 


584 


BL00612


Osteonectin domain proteins. 


BL00612B 11.35 2.034e-11 93-
126 


585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628 


PHD-finger. 


PF00628 15.84 3.455e-12 235-250 


587 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE 


PR00326A 8.75 7.525e-16 227- 
248 PR00326C 9.79 6.760e-15 
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 16.74 
9.229e-13 248-267


589


BL00422 


Granins proteins. 


BL00422A 28.34 7.429e-09 2349- 
2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-
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132 



596
597



600 



606 



608 



610 



615 



616 



PR00049 
DM00547 



WILM'S TUMOUR PROTEIN 
SIGNATURE 



PR00049D 0.00 3.136e-09 31-46 



kw CHROMO BROMODOMAIN
SHADOW GLOBAL. 



PD01066 




PR00019 



PR00019 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



DM00547C 17.30 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28
1.000e-17 179-193 DM00547D
11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-
726 DM00547A 12.38 4.818e-11
158-270



Ribosomal protein L35 proteins. 



PD01066 19.43 1.882e-27 13-52

BL00192A 11.90 6.400e-09 390-
430 

BL00936B 27.27 8.615e-09 118-
157 



LEUCINE-RICH REPEAT 
SIGNATURE 



BL00936B 27.27 8.615e-09 118-
157 



PR00320 



LEUCINE-RICH REPEAT 
SIGNATURE 



PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09 
323-337 



BL00750 



G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 



PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337 



Chaperonins TCP-1 proteins. 



BL00766 



BL00256 



Tetrahydrofolate 

dehydrogenase/cyclohydrolase proteins. 



PR00320C 13.01 9.500e-12 168- 
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 



BL00750B 16.17 1.000e-40 70-
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31
431-471 BL00750F 18.40 5.125e-
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H
21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-
181 BL00750D 16.16 6.318e-14
203-222



BL00319 



Adipokinetic hormone family proteins. 



Amyloidogenic glycoprotein extracellular 
domain proteins. 



BL00766B 24.49 1.000e-40 142-
190 BL00766E 13.78 1.000e-40 
322-359 BL00766C 25.86 5.500e- 
39 208-256 BL00766D 17.05 
4.536e-26 283-313 BL00766A 
21.48 6.063e-24 102-132 



BL00256 12.28 3.298e-10 740-755



BL00319C 17.12 9.053e-09 419-
453 



Eukaryotic RNA-binding region RNP-1 
proteins. 



BL00030A 14.39 4.429e-09 44-63 



620 



622 



BL00325 



Eukaryotic RNA-binding region RNP-1 
proteins. 



BL00030A 14.39 4.429e-09 44-63



BL00972 



Actin-depolymerizing proteins. 



Ubiquitin carboxyl-terminal hydrolases 



BL00325B 21.66 5.817e-16 77- 
123 



BL00972A 11.93 5.500e-19 213-
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RESULTS* 



family 2 proteins. 



231 BL00972D 22.55 2.742e-16
501-526 BL00972B 9.45 1.000e-
11 297-307 BL00972C 16.48
3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548



625 



PD01066 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PD01066 19.43 6.333e-39 6-45 



628 



BL00039 



DEAD-box subfamily ATP-dependent 
helicases proteins. 



BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.63 1.844e- 
15 327-351 BL00039B 19.19 
5.636e-14 242-268



630 



PD00306 



PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 



PD00306A 10.26 7.000e-12 232- 
246 



631 



PD00306 



PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 



PD00306A 10.26 7.000e-12 290- 
304 



633 



BL00785 



5'-nucleotidase proteins. 



BL00785C 9.45 3.625e-16 108- 
122 BL00785E 15.85 4.000e-16 
279-295 BL00785A 9.73 6.500e- 
14 29-40 BL00785B 10.65 
5.500e-13 72-86 BL00785D 9.89
4.000e-12 135-145 



636 



PR00832 



PAXILLIN SIGNATURE 



PR00832E 14.43 9.901e-14 85- 
108 



637 



PR00109 



TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 



PR00109B 12.27 6.362e-13 221- 
240 



638 



PF00635 



MSP (Major sperm protein) domain 
proteins. 



PF00635B 15.84 4.900e-11 463-
502 



639 



PR00860 



VERTEBRATE METALLOTHIONEIN 
SIGNATURE 



PR00860B 7.04 1.900e-18 85-99
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76



641 



PD00066 



PROTEIN ZINC-FINGER METAL- 
BINDI. 



PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620
PD00066 13.92 1.600e-09 187-200



642 



BL00961 



Ribosomal protein S28e proteins. 



BL00961B 11.24 7.429e-37 67-
100 BL00961A 9.90 4.079e-26
42-66 



643 



BL00585 



Ribosomal protein S5 proteins. 



BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30 
193-230 



647 



BL00678 



Trp-Asp (WD) repeat proteins proteins. 



BL00678 9.67 9.400e-10 181-192 



648 



PR00876 



NEMATODE METALLOTHIONEIN 
SIGNATURE 



PR00876C 6.15 9.229e-09 112-
126 



652 



PD01066 



PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 



PD01066 19.43 5.941e-27 29-68 



653 



BL00047 



Histone H4 proteins. 



BL00047A 13.53 1.000e-40 2-41 
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BL00047C 12.18 1.310e-38 74- 
104 


654 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.109e-25 30-69 


655 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 3.483e-17 19-63


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins.


BL00518 12.23 8.286e-10 31-40 


658 


BL00125 


Serine/threonine specific protein
phosphatases proteins. 


BL00125B 21.48 1.000e-40 89-
135 BL00125C 19.97 1.000e-40
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83
8.941e-38 47-84


659 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253 
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.882e-15 193-
238 BL00795C 17.06 3.797e-13
187-232 BL00795C 17.06 5.014e-
13 188-233 BL00795C 17.06
4.506e-12 196-241 BL00795C
17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-
230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e-
11 171-216 BL00795C 17.06
6.111e-11 197-242 BL00795C
17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06
2.779e-10 184-229 BL00795C
17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-
231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e-
09 200-245 BL00795C 17.06

C OA A a AA 1 1 C OOA T">T f\t\*it\r i~\ 

D.ouue-uy I /D-220 BL00795C 
BL00795C 17.06 6.600e-09 201- 

746 RT nfl70^P 1 7 C\£ A /JnAa nn 

^"u Dju\j\j /7Ji/ 1 /.uo o.ouue-uy 
202-247 BL00795C 17.06 6.600e- 


662 


BL00469


Nucleoside diphosphate kinases proteins. 


BL00469 22.22 1.000e-40 149-204 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-11 331-
385 


664 


BL00601


Tryptophan pentad repeat proteins (IRF
family) proteins. 


BL00601A 20.29 5.500e-23 7-46 
BL00601B 20.92 3.631e-13 69-98 


665 
666 


BL00082 
DM01537 


Extradiol ring-cleavage dioxygenases 
proteins. 

kw SKI2W SKI2 NUCLEOLAR


BL00082A 19.07 8.615e-12 49-72 








DM01537B 21.63 4.073e-37 834-
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HELICASE. 


881 DM01537B 21.63 9.750e-21
1669-1716 DM01537A 15.14
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557


667 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820-
867 DM01537B 21.63 9.750e-21
1655-1702 DM01537A 15.14
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543


669 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13
916-932 


670 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 9.735e-27 37-89


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key' 
motif proteins. 


BL00225B 18.06 7.517e-24 1805-
1840 BL00225B 18.06 8.297e-20
1987-2022 BL00225B 18.06
2.575e-19 1896-1931 BL00225B
18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06
5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759-
1780


679 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10 
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243I 31.77 1.143e-11 172-
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612-
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23
468-486 PR00852C 8.81 9.143e-
23 284-303


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 


685 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 7.500e-20 40-58 
BL00972D 22.55 3.903e-16 300- 
325 BL00972B 9.45 1.000e-13
120-130 BL00972E 20.72 5.500e- 
11 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


688 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54 
BL00388B 31.38 3.864e-33 66- 
108 BL00388D 20.71 1.000e-21 
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.105e-15 347- 
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TRAN. - ~ - ^ 


394 


691 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31 


692 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.600e-10 488-505 


694 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013A 25.14 9.357e-33 527-
563 BL01013D 26.81 8.235e-23
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33
3.605e-13 592-603


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147-
2161 PD00289 9.97 2.552e-09 23- 
37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282- 
302 


700 


PR00749 


LYSOZYME G SIGNATURE 


PR00749F 13 63 8 636t>-n no. 
156 PR00749H 8.22 3.681e-12
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33
4.815e-10 24-45


703 



PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132- 
158 PR00704E 12.55 5.500e-27 
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87 
1 7^7e-91 317-^^Q Pfc 0070414 
13.38 8.138e-21 367-385
PR00704A 1 4 68 2 1 25p-1 Q 97-S 1 
PR00704C 11.88 1.257e-17 96-
113 PR00704B 17.94 1.833e-15 
72-95 


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111


706 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 9.581e-26 369-
416 BL00226B 23.86 3.250e-24
203-251 BL00226C 13.23 8.269e-
21 268-299 BL00226A 12.77
8.200e-14 103-118


707 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.440e-10 2-15


708 


BL00361 


Ribosomal protein S10 proteins. 


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15 


710 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 8.412e-27 160-
197 BL00514E 14.28 8.909e-16
219-236 BL00514H 14.95 1.551e-
15 317-342 BL00514G 15.98
7.750e-15 284-314 BL00514D
15.35 4.789e-10 201-214


711 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33 77 8 714^-19 40-00 


714 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 158- 
202 BL00400D 23.26 2.080e-14
222-259 BL00400A 21.59 1.600e- 
10 27-59 


715 


BL01154 


RNA polymerases L / 13 to 16 Kd 


BL01154B 24.55 5.500e-36 40-76
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subunits proteins. 


BL01154A 18.70 3.000e-22 19-40 


716 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 9.786e-32 10-49 


717 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200 


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266- 
316 BL00687D 26.00 5.333e-28 
151-198 BL00687B 17.54 3.647e- 
26 39-81 BL00687C 24.13 
6.087e-22 96-133 BL00687F 9.55 
2.500e-11 352-363


727 


DM01354 


kw TRANSCRIPTASE REVERSE II 
ORF2. 


DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-
185 BL01024D 13.22 1.000e-40
185-222 BL01024E 11.96 1.000e-
40 222-266 BL01024F 9.42
1.000e-40 266-317 BL01024G
11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-
442


736 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51 


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 


740 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83
PR00320A 16.74 7.366e-09 68-83 


743 


PR00871 


DNA 

NUCLEOTIDYLEXOTRANSFERASE 
(TDT) SIGNATURE 


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14 
20-45 BL00215A 15.82 8.851e-11
123-148 BL00215B 10.44 9.526e- 
11 69-82 BL00215B 10.44 
7.300e-09 272-285 BL00215B 
10.44 8.500e-09 165-178 


751 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002A 14.19 1.000e-14 370-
389 BL50002B 15.18 2.200e-10 
408-422 


752 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 3.089e-12 390- 
440 


753 


PF00622 


Domain in SPla and the RYanodine
Receptor. 


PF00622B 21.00 4.214e-14 47-69 


754 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 8.941e-10 66-78 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70 
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4.971e-15 344-363 PR00926B 
16.07 9.526e-13 210-225 
PR00096A 10 41 1 S14e>_19 1Q7 
211 


756 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


RL01 1 87 A 9 OR 9 19Sp-19 ^9A 
336 BL01187A 9.98 4.789e-11
377-389 BL01 187B 12 04 3 0<?7p 
10 439-455 


757 


PF00651 


BTB (also known as BR-C/Ttk) domain
proteins.


PF00651 15 00 4 490p-10zn-^6 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.304e-11 110-123


760 


PR00448 


NSF ATTACHMENT PROTEIN 
SIGNATURE


PROOzMRD 19 AD ^ zKSp 97 1 £9 

386 PR00448A 10.74 1.273e-22 

37-^7 PRft04ztRR 16 01 Q 37Ga 91 

100-118
PR00448C 11.46 1.000e-20 129-147


765 


BL01042 


Homoserine dehydrogenase proteins. 


BL01042A 13.29 5.909e-11 74-95


766 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78


768 


BL00762 


WHEP-TRS domain proteins.


I3L.UU /OZA Z3.4.D O.jUUe-Zo 1 1Z- 

1 40 "RT ftft7/S9T5 1 A 1 4 *2 7Q5 a 1 
1^7 rSJLUU /OZD 10. 14 J. fyot-iZ 

64-78
BL00762A 23.43 6.625e-12 6-43
BL00762C 15.58 4.176e-09 459-472
BL00762D 11.15 9.667e-09 210-220


769 


PR00709 


AVIDIN SIGNATURE


PR00709A 4.60 1.934e-09 1-20 


770 


PR00320 


G-PROTEIN BETA WD-40 REPEAT
SIGNATURE


rKUU3zUU Jo.Ui 1 . /zue-io ZOZ- 
211 PR00320A 16.74 2.853e-10 
262-277 PR00320C 13.01 4.300e- 

00 06-111 PRftft39ftR 19 10 

5.500e-09 262-277 PR00320A 
16 74 6 968p-00 5^-70 


771 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 8.714e-12 87- 
ioi PRnonioA 1 1 1 o i onriA in 

90-104 


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 
204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 

6^9-67^ r>A/T0fK/17R 1198 

1.818e-18 518-532 DM00547C 
17.30 3.531e-17 546-568 

DM00S47A 19 ^JM 973p 1 1 407 

509
DM00547D 11.60 9.200e-11 622-636


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR0077QP M 51 ^ 1zI7a ftO 7£Q 

792 


777 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


778 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742-765
779


BL01282


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 3.118e-11 654-672
PR00205B 11.39 8.588e-11 230-248
PR00205B 11.39 8.527e-10 551-569
PR00205B 11.39 4.203e-09 336-354


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193-227
BL00625A 16.21 5.500e-17 199-228
BL00625B 17.69 1.885e-16 140-174
BL00625B 17.69 2.770e-16 245-279
BL00625A 16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146-175


785 


PF00084 


Sushi domain proteins (SCR repeat)
proteins.


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


786 


PF00084 


Sushi domain proteins (SCR repeat)
proteins.


PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203- 
230 


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90 


789 


PR00102 


ORNITHINE 

CARBAMOYLTRANSFERASE 
SIGNATURE 


PR00102B 14.82 5.418e-09 963-977


790 


BL00030 


Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030B 7.03 5.500e-11 199-209


791 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 393-437
BL00415N 4.29 2.117e-09 103-147
BL00415N 4.29 3.628e-09 97-141
BL00415N 4.29 5.664e-09 387-431


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144 


799 


PF00731


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337-380
PF00731B 19.47 7.429e-28 299-336
PF00731A 19.32 6.333e-24 268-297


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 


BL00170B 20.97 8.071e-09 297- 
337 


805 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 378-389 
BL00678 9.67 5.800e-10 418-429 
BL00678 9.67 8.800e-10 295-306 


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290-318


807


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451-466


AAA 


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-12 564- 
595 


810


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


O I J 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 2.047e-31 16-55


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE


PR00830A 8.41 9.571e-11 115-135


819 


BL00126 


3'5'-cyclic nucleotide phosphodiesterases 
proteins. 


BL00126C 22.07 7.857e-24 528-569
BL00126E 35.22 3.714e-15 669-724
BL00126D 25.50 1.173e-14 584-623
BL00126B 15.20 1.000e-12 502-514
BL00126A 27.56 3.361e-09 461-498


820 


PR00511 


TEKTIN SIGNATURE


PR00511B 12.25 8.826e-22 174-195
PR00511A 13.59 7.723e-11 155-172


R91 


BL00741


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


89? 


PF00780


Domain found in NIK1-like kinases,
mouse citron and yeast ROM. 


PF007801 14.69 4.825e-09 231- 
261 


0^ / 




Eukaryotic RNA-binding region RNP-1
proteins. 


BL00030A 14.39 5.235e-11 144-163


828 


BL00326 


Tropomyosins proteins.


BL00326D 8.76 9.357e-11 545-586


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 235-261
PD02448F 14.22 9.654e-25 279-303
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 305-318


830


BL00720 


Guanine-nucleotide dissociation 
stimulators CDC25 family sign. 


BL00720B 16.57 4.500e-23 483- 
507 


831


BL00107


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.625e-21 143-174
BL00107B 13.31 4.214e-10 213-229


832 


BL00215


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.787e-11 32-57


833


PR00497


NEUTROPHIL CYTOSOL FACTOR
P40 SIGNATURE 


PR00497A 6.92 4.375e-09 41-59


834


BL00229


Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835


BL00421


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053- 
1083 


836 


BL00795


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405- 
445 


837 


PR00020


MAM DOMAIN SIGNATURE 


PR00020A 18.17 1.000e-17 34-53
PR00020B 15.52 5.846e-16 68-85
PR00020D 12.70 2.543e-15 147-162
PR00020C 13.66 3.483e-13 95-107
PR00020E 8.64 6.586e-13 165-179


838


BL50017


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499-1515


839 


PF00850


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 


840 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-149
PF00023B 14.20 5.500e-09 40-50


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-138
BL01194A 18.70 7.632e-38 2-37
BL01194D 19.02 2.658e-36 139-178


843 


BL00610 


Sodium:neurotransmitter symporter
family proteins. 


BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-154
BL00610C 12.94 1.000e-40 206-258
BL00610E 20.34 1.000e-40 355-398
BL00610F 29.02 1.000e-40 454-509
BL00610D 20.97 6.063e-35 272-325
BL00610G 12.89 8.588e-13 514-537


845 


BL00143 


Insulinase family, zinc-binding region 
proteins. 


BL00143A 20.91 4.300e-20 94-121
BL00143C 14.16 5.500e-13 245-258
BL00143B 14.41 9.053e-10 141-156


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins. 


BL00824C 14.58 1.000e-40 129-167
BL00824D 14.04 6.192e-39 167-202
BL00824B 9.21 2.080e-21 96-116
BL00824E 12.49 3.333e-19 210-226
BL00824A 13.78 8.650e-14 19-34


849 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.000e-40 12-51 


850 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.316e-24 10-49 


852 


BL01272 


Glucokinase regulatory protein family 
proteins. 


BL01272B 19.61 6.870e-30 136-171
BL01272C 11.68 3.314e-25 249-274
BL01272A 6.49 1.231e-18 99-117


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 6.850e-11 140-154


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-112
PR00450E 12.14 1.581e-19 114-133
PR00450G 15.33 5.500e-19 166-187
PR00450F 12.30 4.375e-15 140-156
PR00450A 13.58 1.857e-14 8-23


860 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 74-117


866 


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 


BL01078 


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408-429
BL01078A 10.16 2.000e-13 366-379
BL01078D 5.99 3.455e-11 566-576
BL01078C 10.52 3.793e-11 501-513


868


BL01177


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462-489
BL01177C 17.39 5.333e-19 416-435
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 441-459


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415-442
BL01177C 17.39 5.333e-19 369-388
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 394-412


871 


BL50007 


Phosphatidylinositol-specific 
phospholipase X-box domain proteins 
prof 


"RT ^flftn7A io ai i nnn~ Af\ o'v* 

ly.oi l .uUUe-40 322- 
368 BL50007D 19.54 1.000e-40 

36 383-421 BL50007E 25.63 

8.97 5.200e- 19 452-469 


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


RT 00Q79D 99 1 9^n*» 1 7 an 
115 


874 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 4.250e-09 370-386


877 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 5.500e-13 1343- 
1366 


878 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM0091 S 10 43 9 S9^p HO ^9 fl^ 


881 


PD02807 


APOLIPOPROTEIN E PRECURSOR
APO-E GLYCOPROTEIN PLAS. 


PD02807E 10.90 4.702e-09 358-407


882 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47 


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24 
134-1D4 rK00372E 12.62 2. 125e- 
23 360-380 PR00372C 7.90 
3.025e-22 289-309 PR00372F 

1 1 OO d 111b. T1 One v* 1 /I 

u.uy o.333e-zl 395-414 

PR00372D 10.22 1.000e-19 329-348


887 


BL00301 


GTP-binding elongation factors proteins. 


BL00301B 20.09 2.800e-24 103- 
21-33 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger),
proteins. 


RT OfKI 8 19 9"2 1 /£/C7a no on on 

jdLjUujio iz.Z3 i.oo /e-uy 30-3" 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 


892 


BL01022 


PTR2 family proton/oligopeptide 
symporters proteins. 


BL01022B 22.19 6.016e-14 72- 

I 1R "RT 0,109917 90 <1 1 T70« 1 n 
110 DijUiUzzJti Z3.D I 1.173e-12 

472-508 BL01022A 11.58 9.135e- 

19 49-61 RT ni097T^ O /1 7 1 yicc fl 

II 199-212 


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


894


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


895 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 9.100e-14 116-138
PR00237F 13.57 1.360e-13 012.-551
PR00237G 19.63 9.069e-13 353-380
PR00237E 13.03 7.120e-12 243-267
PR00237D 8.94 4.150e-11 194-216
PR00237A 11.48 4.375e-11 83-108


896 


BL00129 


Glycosyl hydrolases family 31 proteins. 


BL00129D 16.76 8.258e-26 634-678
BL00129A 26.21 1.720e-25 384-430
BL00129E 22.60 4.857e-23 698-734
BL00129C 15.12 1.750e-22 596-624
BL00129B 19.19 5.891e-18 495-522
BL00129F 26.19 7.545e-15 814-852


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 
CHANNEL IN. 


PD01101B 21.53 1.000e-40 274-327
PD01101D 24.45 1.000e-40 457-512
PD01101A 18.25 6.268e-23 83-117
PD01101C 12.69 1.237e-16 366-386
PD01101E 6.73 7.750e-12 566-576


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52 


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63 


903 


BL01115 


GTP-binding nuclear protein ran proteins.


BL01115A 10.22 1.509e-11 21-65


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539-572
DM00215 19.43 4.750e-12 549-582
DM00215 19.43 9.824e-11 551-584
DM00215 19.43 2.929e-10 548-581
DM00215 19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552-585
DM00215 19.43 7.107e-10 544-577


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314- 
332 


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125-1156


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-ll 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435-459
PR00962G 15.71 4.086e-26 593-618
PR00962B 11.98 9.122e-26 296-319
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 348-369
PR00962F 12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-643
PR00962I 11.68 9.786e-20 692-712
PR00962E 8.81 2.915e-18 515-534


915 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365-389
PR00962G 15.71 4.086e-26 523-548
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 278-299
PR00962F 12.39 9.769e-21 482-502
PR00962H 13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-642
PR00962E 8.81 2.915e-18


916 


BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


BL00134A 11.96 5.886e-14 90-

1fi7 


917 


BL00478 


LIM domain proteins.


226 BL00478B 14.79 6.712e-10 


918 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.729e-09 973-988


922 


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 1.000e-40 37-84 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.063e-09 79-111


925 


BL00072 


Acyl-CoA dehydrogenases proteins. 


BL00072D 30.08 2.837e-24 280-331
BL00072E 24.12 8.200e-24 368-411
BL00072C 25.30 7.873e-2U 2zo-2o7
BL00072B 9.48 6.049e-12 183-196


927 


BL00237


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- ~ j 

25o J3L0U237A 27.68 6.657e-13 

90-130 BL00237D 11.23 9.571e- 
i "3 oqa it\n 


928 


BL01033 


Globins profile. 


BL01033A 16.94 7.923e-18 25-47 
BL01033B 13.81 1.000e-15 93- 
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203-253


932 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-10 353-397
BL00415N 4.29 2.117e-09 63-107
BL00415N 4.29 3.628e-09 57-101
BL00415N 4.29 5.664e-09 347-391


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 13.33 9.000e-30 223-249
PD02448F 14.22 9.654e-25 267-291
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 293-306


934 


DM00191 


w SPAC8A4 04C RF^T^TANPF 
SPAC8A4.05C DAUNORUBICIN. 


TYIVvfflAi Q1 T\ 1*2 CSA d f\ort — in i 

x^iviuuiyiJLi l 9.Uo3e-10 136- 
175 


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67-111


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937 


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183-201
PR00762C 9.29 1.000e-21 268-288
PR00762E 12.07 3.250e-20 520-537
PR00762D 11.29 1.000e-19 470-491
PR00762F 15.12 1.429e-19 538-558
PR00762B 12.12 1.818e-18 214-234
PR00762G 14.13 3.455e-17 577-592


938 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE
TRANSFORMING 61KPDF1.


DM01111E 17.28 1.568e-10 248-297
DM01111E 17.28 5.168e-10 659-708
DM01111D 16.76 5.263e-09 279-325
DM01111M 10.67 8.674e-09 911-935


940 


BL00107 


Protein kinases ATP-binding region
proteins. 


BL00107B 13.31 1.000e-14 293- 
309 BL00107A 18.39 6.760e-13 
229-260 


942


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543-597


943 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 3.500e-35 8-47 


945 


BL00989 


Clathrin adaptor complexes small chain 
proteins. 


BL00989B 26.51 1.000e-40 66-117
BL00989A 11.66 1.000e-13 5-19


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178D 13.52 9.571e-09 450- 
469 


947 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 4.857e-09 713- 
724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.050e-10 180- 
230 


952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-ll 26-49 
PR00926F 17.75 6.348e-09 134- 
157 


955 


PF00109 


Beta-ketoacyl synthase. 


PF00109 13.08 2.846e-12 342-357 


957 


PR00069 


ALDO-KETO REDUCTASE 
SIGNATURE 


PR00069A 16.01 8.826e-24 26-51
PR00069B 11.33 1.514e-17 86-105
PR00069C 16.03 8.816e-14 155-173


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 
642 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


963 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


BL00354A 3.83 9.438e-10 1489- 
1499 


964 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 53-96 


965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 581- 
616 


966 


PR00515 


5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33 


967 


BL00579 


Ribosomal protein L29 proteins. 


BL00579B 21.99 5.065e-21 164- 
194 


970 


BL00504 


Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 


BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 


973 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 
271 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.429e-22 99- 
139 


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-28 94- 
126 


977 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13 Q9 4 £OHp l/i < 309_/in<
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate-tetrahydrofolate ligase proteins. 


BL00721B 13.21 1.000e-40 346-401
BL00721D 13.90 1.000e-40 538-592
BL00721E 13.46 1.000e-40 597-646
BL00721I 18.79 2.500e-40 814-860
BL00721H 91 90 R 93 Qp 70 91/1
BL00721A 15.31 9.719e-32 287-
498-535
BL00721F 15.96 8.232e-27 660-702
BL00721G 7.97 3.017e-10 721-734


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 150-201


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95
BL00869E 13.12 9.129e-18 120-157
BL00869J 15.60 6.032e-17 270-310
BL00869H 11.08 1.840e-16 219-242
BL00869G 13.55
9 ^Zf?A 1£ 109 91/1 13T (\f\QCCYE
Z.j«+je-iO iSJjUUooyjr
19 77 7 031<»-14 1 ^7 109
BL00869I 12.92 3.274e-12 242-270
BL00869D 14.02 5.282e-10 95-124
BL00869B 15.55 9.382e-10 31-61


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154- 
209 



* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid 
sequence 
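
The footnote above fixes the field order within each RESULTS entry. As an illustration only, the following Python sketch splits one RESULTS cell into its individual signature hits. The names `SignatureHit` and `parse_results` are hypothetical, and the regular expression assumes that an accession is two letters followed by five digits and an optional subtype letter, which matches the clean entries in this table but is not stated by the source.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    subtype: str      # accession number subtype, e.g. "BL00215A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Assumed field layout, following the footnote:
# accession subtype, raw score, p-value, then "start-end".
_HIT = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-\s*(?P<end>\d+)"
)


def parse_results(cell: str) -> list[SignatureHit]:
    """Return every well-formed hit found in one RESULTS cell."""
    # Collapse line wraps so a position split as "221-\n246" is read as one range.
    text = " ".join(cell.split())
    return [
        SignatureHit(m["subtype"], float(m["score"]), float(m["p"]),
                     int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(text)
    ]


if __name__ == "__main__":
    cell = "BL00215A 15.82 5.200e-15 221-\n246 BL00215A 15.82 7.618e-14\n20-45"
    for hit in parse_results(cell):
        print(hit)
```

Entries that are damaged in the table (for example a missing position) simply fail to match and are skipped, so the sketch never invents values for them.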



TABLE 4 



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_l 


Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


6.7e-08 


27.3 


9 


PWWP 


PWWP domain 


8.1e-16 


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20 


81.3 


14 


Aa_trans


Transmembrane amino acid 
transporter protein 


2.7e-42 


153.9 


15 


E1-E2 ATPase 


E1-E2 ATPase 


6.3e-124 


412.2 


16 


trypsin 


Trypsin 


1.2e-87 


278.6 


17 


ig


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin_c


Lectin C-type domain 


0.0003 


21.2


20 


Alpha_L_fucos


Alpha-L-fucosidase


1.2e-217 


736.5 


22 


pkinase 


Eukaryotic protein kinase domain 


3.3e-87 


303.1 


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


24 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.8 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100


347.4 


28 


spectrin 


Spectrin repeat 


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat 


1.2e-07 


38.8 


33 


rrm


RNA recognition motif. 


1.1e-17


72.2 


34 


rrm


RNA recognition motif. 


1.1e-17


72.2 


36 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


3e-36 


117.3 


37 


ank 


Ank repeat 


5.9e-25 


96.3 


38 


SRF-TF 


SRF-type transcription factor 


1.4e-36 


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2 


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region 


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding 
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46


147.6 


60 


Kunitz_BPTI 


Kunitz/Bovine pancreatic trypsin 
inhibito 


3.7e-47 


148.6 


62 


DAD 


DAD family 


2.5e-74 


260.3 


63 


MOZ_SAS 


MOZ/SAS family 


5.9e-133 


455.1 


64 


MOZ_SAS 


MOZ/SAS family 


1.7e-123 


423.6 


65 


ras 


Ras family 


9.3e-89 


308.3 


67


Ham1p_like


Ham1 family


3.7e-49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


5.2e-39 


126.1 


70 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41 


1.2e-110 


381.0 


72


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81 


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


84 


AAA 


ATPases associated with various 
cellular act 


1.3e-77 


271.3 


85 


homeobox 


Homeobox domain 


1.4e-28 


108.3 


87 


TGF-beta 


Transforming growth factor beta like 


6.7e-68 


210.2 


91 


mito_carr 


Mitochondrial carrier proteins 


4.6e-57 


198.5 


95 


adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 


96 


ig 


Immunoglobulin domain 


4.1e-20 


69.8 


99 


CNH 


CNH domain 


3.4e-120 


412.7 


100 


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-47 


170.8 


102 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-89 


309.4 


103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25


97.9 

112 


HSP20 


Hsp20/alpha crystallin family


O OA 

z.oe-zu 


77.7 


115 


EF_TS


Elongation factor TS 


j.oe-o3 


OO 1 1 

ZZ 1.1 


116 


sugar_tr 


Sugar (and other) transporter


4e-oi 


OO""? 1 

223.1 


118 


catalase 


Catalase 


0


1158.9


119 


UCH 


Ubiquitin carboxyl-terminal
hydrolase, famil


1 /» 1 o 


O A A 

24.4 


122 


metalthio 


Metallothionein 


2.8e-25 


97.4 


125 


adh_short


short chain dehydrogenase 




1 £.A C 

104.0 


126 


KRAB 


KRAB box 


7 Oa *?c. 


AC A 


127 


G-alpha 


G-protein alpha subunit


i e-z*jy 


o43.0 


128 


mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD 


EF-1 guanine nucleotide exchange
domain


4.ye-53 


189.6 


132 


GYF 


GYF domain 


4.9e-28


106.6 


133 


GYF 


GYF domain 


4.9e-28 


106.6 


134 


lipocalin 


Lipocalin / cytosolic fatty-acid
binding pr


z.le-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain


j.^e-oo 


299.8 


136 


ank 


Ank repeat


2.2e-29


111.1 


137 


IL8


Small cytokines
(intecrine/chemokine), inter


1.1e-18


65.2 


139 


pyridoxal_deC 


Pyridoxal-dependent decarboxylase
conse


A AAA! 1 


19.0 


140 


cadherin 


Cadherin domain 


1.3e-88 


307.8 


142 


efhand 


EF hand 


j. /e-33 


123.0 


143 


Acyltransferase 


Acyltransferase


2e-29


111.2


146 


cytochrome_c


Cytochrome c 


i. /e-33 


124.7 


147 


pkinase 


Eukaryotic protein kinase domain 


2.3e-86 


300.3 


148 


PDZ 


PDZ domain (Also known as DHR or
GLGF). 


1.7e-09


45.0 


149 


aldo_ket_red


Aldo/keto reductase family 


7.4e-189 


640.8 


150 


homeobox 




1 0«. AO 


38.7 


151 


PseudoU_synth_1


tRNA pseudouridine synthase


4.7e-57 


203.0 


152 


abhydrolase 


alpha/beta hydrolase fold


i./e-Ji 


118.0 


153 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


i.ie-uy 


45.6 


156 


PHD 


PHD-finger


7.6e-15 


62.8 


157 


fn3


Fibronectin type III domain




21.9 


158 


homeobox 


Homeobox domain 


z. /e-z/ 


104.1


160 


PWI 


PWI domain 


2 Qa Oyl 


93.6 


162 


DnaJ 


DnaJ domain 


2e-06 


34.8 


164 


Cbl_N 


CBL proto-oncogene N-terminal
domain


oe-1 17 


401.5 


166 


metalthio 


Metallothionein 


5 1 a OiC 


100.6 


167 


LRR 


Leucine Rich Repeat 


0.00069 


26.3 


169 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


5.3e-180


611.4 


170 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


5.3e-180


611.4 


171 


fibrinogen_C 


Fibrinogen beta and gamma chains,
C-term


1 o 1 /1Q 

Ie-J4y 


510.8 


173 


homeobox 


Homeobox domain 


1 C- OA 

l .3e-zy 


111.6 


174 


FYVE 


FYVE zinc finger


7.4e-28 


103.8 


175 


GRIP 


GRIP domain 


3.9e-08 


40.5 


182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain


2.2e-50 


180.8 


187 


TBC 


TBC domain 


2.2e-50 


180.8


188 


PDZ 


PDZ domain (Also known as DHR or
GLGF).

4e-13 


57.0 




Kelch


Kelch motif 


5.2e-106 


365.6 


1 Qn 


Tropomyosin 


Tropomyosins


3.8e-171


535.4




Rieske 


Rieske [2Fe-2S] domain


0.0016 


18.5 


199 


ig 


Immunoglobulin domain 


5.9e-19 


66.1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 


203 


trefoil 


Trefoil (P-type) domain 


le-24 


95.5 


204 


TBC 


TBC domain 


8.5e-38 


139.0 


205 


efhand 


EF hand 


0.0096 


22.6 


206 


ISK_Channel 


Slow voltage-gated potassium 
channel 


0.0031 


8.1 


207 


trefoil


Trefoil (P-type) domain


2.9e-48 


173.7 


209 


Ribosomal_S13


Ribosomal protein S13/S18


1.2e-78


274.7 


210 


hemopexin 


Hemopexin 


1.3e-62


221.5 


213 


TBC


TBC domain 


2.5e-48 


174.0 


215 


Basic 


Myogenic Basic domain 


4.3e-50 


179.8 


216 


Ribosomal_L24 


KOW motif 


8.2e-23 


89.2 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 


223 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand 


6.1e-06 


33.2 


225 


Pterin_4a


Pterin 4 alpha carbinolamine 
dehydratase 


9.3e-42 


152.1 


228 


ABC_tran


ABC transporter 


4.1e-110 


379.2 


234 


E1_DerP2_DerF2


E1 family


3.7e-90 


312.9 


235 


E1_DerP2_DerF2


E1 family


1.6e-48 


174.6 


237 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


1.7e-25 


98.1 


238 


Opiods_neurope 
P 


Vertebrate endogenous opioids 
neurope 


1.8e-159 


543.2 


239 


eIF-5a 


Eukaryotic initiation factor 5A 
hypusine 


5.9e-104 


358.8 


240 


Amino_oxidase 


Flavin containing amine oxidase 


2.5e-11


37.8 


243 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-99 


343.6 


244 


Band_7


SPFH domain / Band 7 family. 


2.3e-53 


190.7 


245 


ank 


Ank repeat 


1.6e-88 


307.5 


246 


zf-C2H2 


Zinc finger, C2H2 type 


6.7e-49 


175.9 


247 


actin 


Actin 


2.3e-42 


140.3 


248 


ER_lumen_recep 
t 


ER lumen protein retaining receptor 


2.4e-155 


529.5 


OCA 

250 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


2.2e-38 


140.9 


252 


Collagen 


Collagen triple helix repeat (20
copies) 


1.4e-13 


58.6 


255 


C2 


C2 domain 


0.052 


7.8 


257 


CAP_GLY


CAP-Gly domain 


1.4e-20


81.8


260 


WD40 


WD domain, G-beta repeat 


9.9e-62 


218.5 


261 


WD40 


WD domain, G-beta repeat


9.9e-62 


218.5 


262 


WD40 


WD domain, G-beta repeat


9.9e-62 


218.5 


263 


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


7.8e-21 


82.6 


264 


Ribosomal_L14


Ribosomal protein L14p/L23e


9.2e-10 


40.6 




SAPA


Saposin A-type domain 




1 /T3 A 


266 


SAPA 


Saposin A-type domain 


4.4e-27 


103.4 


267 


ABC_tran 


ABC transporter 


9.5e-39 


142.2 


269 


Ribosomal_L14


Ribosomal protein L14p/L23e 


6.2e-62 


219.2


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 
SEQ ID
NO: 

273 



276 
277 



279 



282 



287 



289 



293 
295 



296 



297 



298 



299 
301 



302 



307



308 



r JlO~ 



311 



312



314 



325 



329 



330 



"332" 
337 



340 



343 



346 
347 



348 



351 



353 



354 
360 



362 



365 



366 
368 



369 



370 



PFAM NAME 



PCT/US01/04098



rrm 



lipocalin 



DESCRIPTION 



RNA recognition motif. 



WD40 



G-patch 



Antiproliferat 



KRAB 



7tm_3 
SET 



Lipocalin / cytosolic fatty-acid
binding pr 

Ras family



Ubiquitin carboxyl-terminal 
hydrolase, famil 



START domain 



WD domain, G-beta repeat
G-patch domain 



BTG1 family 



KRAB box 



7 transmembrane receptor 



Pyridox_oxidase



rrm 



Ubie_methyltran 



Ubie_methyltran



Cyt_reductase



G-patch 



7tm_1



SET domain 



Pyridoxamine 5'-phosphate oxidase



RNA recognition motif.
ubiE/COQ5 methyltransferase family



ubiE/COQ5 methyltransferase family 



— ^viujiuHiijtmac ia± 

FAD/NAD-binding Cytochrome 
reductase 



G-patch domain



p-value 



0.074 



2.5e-41 



1.1e-67



1.2e-147 



3.2e-09 



1.8e-27 



7.8e-22 



1.2e-101 



7.1e-21 



3.3e-73 



5e-30 



1.3e-76 



PFAM 
SCORE 



146.4 



503.9 



44.1 



104.7 



86.0 



351.0 



82.8 



256.6 



113.2 



5.4e-45 



6.3e-05 



0.0024 



7.7e-61 



PH 



7tm_1



Rhodanese 



tubulin 



SURF4 



IMS 



7 transmembrane receptor (rhodopsin 
family) 



PH domain 



7 transmembrane receptor (rhodopsin 
family) 



Rhodanese-like domain 



Tubulin/FtsZ family
SURF4 family 



cadherin 



NAC 



IP_trans



TFIIS



zf-C2H2 



AIRS 



annexin 



Stathmin 



Ribosomal_L16



lactamase_B



impB/mucB/samB family



3.1e-14 



7.7e-43 



0.0015 



1.4e-84 



3.3e-64 



4.9e-286 



1.2e-199



Cadherin domain 



NAC domain 



Phosphatidylinositol transfer protein 



Transcription factor S-II (TFIIS) 



Zinc finger, C2H2 type 



AIR synthase related protein



Annexin 



2e-58 



4.3e-91 



2.1e-28 



6.5e-98 



8.8e-05 



3.6e-61 



4e-32 



Stathmin family 



Ribosomal protein L16



4.6e-80 



efhand 



lectin 



WD40 



Acetyltransf 
tRNA-synM 



Metallo-beta-lactamase superfamily 



EF hand 



Lectin C-type domain 



WD domain, G-beta repeat



Lipocalin / cytosolic fatty-acid
binding pr 



Sulfatase 



pkinase 



ACBP 



Acetyltransferase (GNAT) family



tRNA synthetases class I (I, L, M and 



Sulfatase 



START domain 



Eukaryotic protein kinase domain 



Acyl CoA binding protein 



1.8e-90



4.6e-09 



0.012 



2.5e-14 



1.3e-05 



2.2e-18 



268.0 



162.9 



-96.3 



-118.1 



215.5 



60.7 



T38T* 



17.8 



270.8 



226.7 
963.6 



676.6 



207.5 



316.0 



107.8 



338.7 



29.3 



216.6 



1202 



279.4 



314.0 



34.9 



-6.0 



61.0 



32.1 



6.3e-10 



0.0019 



4.6e-185 



74.5 



38.3 



24.9 



628.2 



6.1e-228 



3.8e-11



2.4e-10 



4.4e-56 




770.6 
50.5 



41.3 


354 


RasGEF


RasGEF domain


O 1 ^ An 

o.le-43 


155.7 


3o-> 




TBC domain


0.017


-DO. 6 


389 


Glycos_transf_2 


Glycosyl transferases 


1.3e-15 


65.3 


390


Na_Ca_Ex


Sodium/calcium exchanger protein 


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain 


4.1e-102 


352.6 


392 


fn3


Fibronectin type III domain


3.4e-45 


163.6 


393 


fn3


Fibronectin type III domain 


3.4e-45 


163.6 


394 


ldl_recept_b


Low-density lipoprotein receptor 
repeat 


7.1e-49 


175.8 




Ribosomal_L30


Ribosomal protein L30p/L7e


0.0023


16.0 


396 


Oxysterol_BP


Oxysterol-binding protein 


1.5e-94 


327.5 


397 


RDS_ROM1


Peripherin/rom-1


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily 


3.4e-39 


143.6 


402 


F-box 


F-box domain. 


0.0002 


28.1 


403 


CLP_protease 


Clp protease 


4.8e-64 


226.2 


A f\C 

405 


Ribosomal_L35 
Ae 


Ribosomal protein L35Ae 


6e-77 


269.0 


406 


LIM


LIM domain containing proteins 


0.00021 


20.7 


A 1 A 

410 


tRNA-synt_1c


tRNA synthetases class I (E and Q)


1e-236


799.8 


411


NTP_transf_2


Nucleotidyltransferase domain 


3.9e-16 


67.0 


A 1 O 
412 


TNT? A T\ 

DEAD 


DEAD/DEAH box helicase 


0.00016 


17.2 


414 


DUF94


Domain of unknown function DUF94 


0.00011 


26.9 


415 


tubulin 


Tubulin/FtsZ family 


4.5e-289 


973.7 


420 


SET 


SET domain 


3.3e-57 


203.5 


421 


WD40 


II JTV J • a*"** 1 j . 

WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-39 


144.9 


424 


pkinase 


Eukaryotic protein kinase domain 


8.9e-75 


261.8 


428 


LIM 


LIM domain containing proteins 


1.8e-34 


126.7 


431 


kazal 


Kazal-type serine protease inhibitor 
domain 


3.7e-18 


73.8 


432 


SH2


Src homology domain 2 


1.4e-67 


198.4 


/t -5 "3 

433 


zf-C2H2


Zinc finger, C2H2 type 


2.8e-144 


492.7 


A*3 A 

434 


ras 


Ras family 


0.012 


-106.8 


Aid 

436 


E1-E2_ATPase


E1-E2 ATPase 


1.6e-117 


391.0 


437 


RNA_pol_A 


RNA polymerase alpha subunit 


0 


1077.7 


438 


PHD 


PHD-finger 


1.6e-11


51.7 


439 


lectin_c 


Lectin C-type domain 


4.7e-30 


113.3 


440 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-65


231.6 


441 


arrestin 


Arrestin (or S-antigen) 


2.9e-254 


858.1 


442 


aminotran_3 


Aminotransferases class-III
pyridoxal-pho 


8.2e-80 


231.1 


A A*5 

443


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


8.5e-12 


52.6 


AAA 
444 




Uir/JNr-l iamuy 


2.6e-277 


934.6 


4 J 1 


T-box


T-box


3.8e-117




4jJ 


Rieske 


Rieske [2Fe-2S] domain


2.6e-13 


57.7 


454


zf-C2H2


Zinc finger, C2H2 type


3.9e-64 


226.5 


456 


homeobox 


Homeobox domain 


2.8e-08 


38.9 


459


ig 


Immunoglobulin domain 


2.6e-20 


70.5 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase 


4e-25 


96.9 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain


2.4e-17 


71.1 


467 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


468


Sterol_desat


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase 


Cyclophilin type peptidyl-prolyl cis- 
tr 


2.6e-63 


220.9 


470 


Peptidase_M24


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 






GLGF).






c c c 

555 


WW 


WW domain 


1.3e-24


AC O 

95.3 


ceo 
558 


kinesin 


Kinesin motor domain 


to 1 —!C 

1.8e-176 


CAA T 

599.7 


C CO 

559 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


r\ r\t\f\ o c 

0.00085 


1 c c 

16.5 


563


efhand


EF hand


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9 


568 


PH 


PH domain


3.1e-39 


143.8 


569 


Hist_deacetyl 


Histone deacetylase family 


5.2e-106 


365.6 


C7A 

570 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


3.4e-20 


80.5 


571 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


1e-16


58.5 


573


ubiquitin 


Ubiquitin family 


1 A ~ AO 

1.4e-08


31.1 


574 


FH2 


Formin Homology 2 Domain


1.3e-110 


380.9 


576 


serpin 


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76 


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain 


4.4e-53 


189.8 


582 


Ribosomal_L7A
e 


Ribosomal protein L7Ae 


0.028 


1.0 


584 


kazal 


Kazal-type serine protease inhibitor 
domain 


2.2e-52 


187.4 


585 


LRR 


Leucine Rich Repeat 


4.4e-28 


106.7 


586 


PHD 


PHD-finger 


3.8e-12 


53.8 


588 


GTP1_OBG


GTP1/OBG family


1.1e-62


215.2 


590 


Collagen 


Collagen triple helix repeat (20 
copies) 


8e-42 


152.4 


591 


lys 


C-type lysozyme/alpha-lactalbumin 
family 


1.6e-31 


116.4 


596 


A /~t"T> TT 

ACBP 


Acyl CoA binding protein 


0.0022 


-9.4 


597 


CXTT?0 XT 

SNF2_N 


SNF2 and others N-terminal domain 


3.7e-98 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237 


802.4 


613 


THF_DHG_CY
H


Tetrahydrofolate
dehydrogenase/cyclohydro


4.9e-173 


588.3 


617 


rrm 


RNA recognition motif 


4e-14 


60.4 


618 


rrm 


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin_ADF 


Cofilin/tropomyosin-type actin- 
binding pr 


3e-06 


34.2 


o/l 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


DZZ 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


5.8e-21 


83.1 




zf-C2H2


Zinc finger, C2H2 type


2.5e-124 


426.4 




DEAD


DEAD/DEAH box helicase


2.5e-68 


219.0 


632


GST


Glutathione S-transferases. 


4.8e-26 


89.0 


633


5_nucleotidase


5'-nucleotidase


6.6e-248 


837.0 


o3o 


LIM


LIM domain containing proteins


1.6e-88


307.5 


637


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638


MSP_domain


MSP (Major sperm protein) domain


8.4e-09 


42.7 


VJ7 


metalthio


Metallothionein


2e-24


94.0 


641 


zf-C2H2 


Zinc finger, C2H2 type


6.1e-114 


391.9 


642 


Ribosomal_S28e


Ribosomal protein S28e


9.3e-48


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4 
648


Lipase_GDSL


Lipase/Acylhydrolase with GDSL-like motif


0.015






finger) 






/Hy 


mito_carr


Mitochondrial carrier proteins 


4.5e-o7 


232.8 


750


DUF27


Domain of unknown function DUF27


4.5e-12


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 




HMG_box


HMG (high mobility group) box 


8.6e-13 


55.9 


753


SPRY


SPRY domain 


5.9e-05 


23.3 


Ts.A 


GTP_CDC


Cell division protein 


7.5e-153 


521.2 


/JJ 


mito_carr


Mitochondrial carrier proteins 


3e-88 


305.4 


/JO 


TSPN


Thrombospondin N-terminal-like
domains 


8.1e-58 


205.5 


fj I 


BTB


BTB/POZ domain


5.7e-23 


89.7 




zf-C2H2


Zinc finger, C2H2 type


1.2e-12


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal_S14


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 


ThiF_family 


ThiF family 


1.7e-39 


144.6 


766 


DnaJ 


DnaJ domain 


3.9e-36 


133.5 


768 


tRNA-synt_2b


tRNA synthetase class II 


9.1e-81 


281.7 


769 


ldl_recept_a 


Low-density lipoprotein receptor 
domain 


0 


1404.5


770 


WD40 


WD domain, G-beta repeat 


2e-21 


84.6 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2_N


SNF2 and others N-terminal domain


5.5e-99 


342.3 


776 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


777


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9) 
domain 


1.1e-30


115.4 


779 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-08 


31.0


781 


cadherin 


Cadherin domain 


5.6e-113 


388.7 


783 


HECT 


HECT-domain (ubiquitin-
transferase).


4.2e-31 


116.8 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60


214.3 


786


sushi


Sushi domain (SCR repeat)


1.8e-60 


214.3 


TOO 

788 


vwa 


von Willebrand factor type A domain


1.9e-52 


187.7 


TAA 

790 


rrm 


RNA recognition motif. 


2.8e-20 


80.8 


791 


Collagen 



Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


TOO 

ly± 


pkinase 


Eukaryotic protein kinase domain 


0.023 


12.4 


/so 


zf-C2H2


Zinc finger, C2H2 type


6.5e-95


328.7 


/yo 


adh_short 


short chain dehydrogenase 


4.1e-05 


-7.3 


-7QQ 

lyy 


SAICAR_synt


SAICAR synthetase 


6e-125 


428.5 


QAC 


WD40


WD domain, G-beta repeat


4e-65 


229.8 


oUO 




ZU5 domain 


4.7e-37 


136.5 


oU/ 


WD40


WD domain, G-beta repeat 


0.016 


21.8 


rap. 


WD40


WD domain, G-beta repeat


0.0041 


23.8 


809 


pkinase 


Eukaryotic protein kinase domain 


2e-31 


117.2 


810


vwa


von Willebrand factor type A domain


1.9e-52 


187.7 


814 


zf-C2H2 


Zinc finger, C2H2 type 


4.5e-83 


289.4 


815 


zf-C2H2


Zinc finger, C2H2 type 


6e-74 


259.3 


817 


myosin_head


Myosin head (motor domain) 


1.5e-176 


599.9


O 1 O 

818 


GSPII_E 


Bacterial type II secretion system 
protein 


0.012 


11.5 


819


PDEase 


3'5-cyclic nucleotide 
phosphodiesterase 


1.1e-74


215.5 


821


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm 


RNA recognition motif. 


1.5e-06 


35.2 


904


Armadillo_seg


Armadillo/beta-catenin-like repeats


1 . i e-uo 


J J.O 


yuo 




Formin Homology 2 Domain


4.3e-l 1Z 


IKS 7 
JOJ. / 


yu / 


uyuQyiyiiransi 


^ynuyjyitransierase 


1 .4e-U3 


OO 1 


on r 
yuo 


pkinase


Eukaryotic protein kinase domain


1 Oo M 

i .ze-04 




ono 
yuy 


pkinase


Eukaryotic protein kinase domain


o.je- /u 




oin 

y 1U 




Eukaryotic protein kinase domain


z.ye-4z 




01 1 

yi i 


pkinase 


Eukaryotic protein kinase domain 


1 


13 l.o 


Q10 
y 1Z 


PHD


PHD-finger


j. ie-uo 




01*3 


PHD


PHD-finger


j.jc-10 


00. D 


y 10 


filament 


Intermediate filament proteins 


O To 1 Ol 

y. /e-izi 


414.J 


917 


LIM 


LIM domain containing proteins 


5.9e-15 


57.9 


Ol 0 

yi& 


SAM


SAM domain (Sterile alpha motif)


4.ie-lo 


O0.9 


yZZ 


Acylphosphatase 


Acylphosphatase 


2.9e-63 


223.6 


y/4 




Immunoglobulin domain 


1.3e-08 


32.8 


925 


Acyl-CoA_dh 


Acyl-CoA dehydrogenase 


2.4e-131 


449.8 


aoo 

927 


7tm_l 


7 transmembrane receptor (rhodopsin 
family)


2.9e-45 


145.9 


yZo 


globin 


Globin


2.4e-52 


1 0£. A 

186.9 


929 


sugar_tr


Sugar (and other) transporter 


1.2e-16 


68.8 


932 


Collagen 


Collagen triple helix repeat (20 
copies) 


0.00097 


9.7 


yi j 


HMG_box


HMG (high mobility group) box


7.8e-34 


125.8 


y^4 


CP A 


oxiA domain 


0.0021 


24.7 


yjj 


ras 


Ras family 


o.4e-59 


1AA O 

209.2 


y^o 


CH


Calponin homology (CH) domain 


3.8e-21 


83.7 


y3 / 


voltage_CLC


Voltage gated chloride channels 


1.9e-199 


676.0 


yoo 


homeobox 


Homeobox domain 


1.9e-25 


98.0 


y4o 


pkinase


Eukaryotic protein kinase domain 


9.9e-58 


oac n 

205.2


y4Z 


Myosin_tail 


Myosin tail 


O T~ Aft 

3.7e-09 


38.2


y4J 


zf-C2H2


Zinc finger, C2H2 type


2.2e-92 


320.3 


y4j 


Clat_adaptor_s


Clathrin adaptor complex small chain


1.3e-76


268.0 


946 


sugar_tr


Sugar (and other) transporter 


0.017 


-122.8


947 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.00097 


15.6 


945 


T)TTT\ 

PHD 


PHD-finger


2.2e-17 


71.2 


API 

951 


sugar_tr 


Sugar (and other) transporter 


A AAO 

0.0082 


-113.9 


n.co 

952 


mito_carr 


Mitochondrial carrier proteins 


1.7e-54 


189.7 


953 


myb_DNA-
binding


Myb-like DNA-binding domain 


4.5e-20 


80.1 


0*< 

yjj 


ketoacyl-synt 


— r 

Beta-ketoacyl synthase 


7.1e-133


AC A O 

454.8 


957 


aldo_ket_red


Aldo/keto reductase family 


1.5e-98 


340.8 


oca 
y59 


Kelch


Kelch motif


A AO 

0.02 


OA O 

20.8 


yoi 


ras 


Ras family 


2.2e-29 


111.1 


yo4 


homeobox 


Homeobox domain 


5.4e-22 


86.5 


yoD 


PH


PH domain


3e-21 


OA A 

80.9 


yoo 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


2.2e-09


*? A H 

34.7 


967 


Ribosomal_L29 


Ribosomal L29 protein 


1.6e-15 


65.0 


y /u 


FAD_binding_2


FAD binding domain 


8.9e-47 


166.6 


on i 
y 71 


rve 


Integrase core domain 


A AAA 1 C 

0.00015 


19.8 


y/2 


Glycos_transf_2 


Glycosyl transferases 


2.1e-21 


84.5 


y74 


Ribosomal_L10 


Ribosomal protein L10 


3.3e-48 


173.6 


975 


7tm_l 


7 transmembrane receptor (rhodopsin 

family)


1.6e-37 


121.3 


976 


zf-C4 


Zinc finger, C4 type (two domains) 


2.1e-52 


178.5 


977 


zf-C2H2 


Zinc finger, C2H2 type 


6.6e-150


511.4 


978 


FTHFS 


Formate-tetrahydrofolate ligase 


0 


1367.2 


982 


Renal_dipeptase 


Renal dipeptidase


1.3e-73


258.0 


984 


A_deaminase


Adenosine/AMP deaminase 


2.6e-05 


-48.6 
54  1038  2022  3006  787CIP2_56  7591
55  1039  2023  3007  787CIP2_57  7600
56  1040  2024  3008  787CIP2_58  7604
57  1041  2025  3009  787CIP2_59  7612
58  1042  2026  3010  787CIP2_60  7613
59  1043  2027  3011  787CIP2_61  7615
60  1044  2028  3012  787CIP2_62  7616
61  1045  2029  3013  787CIP2_63  7617
62  1046  2030  3014  787CIP2_64  7623
63  1047  2031  3015  787CIP2_65  7625
64  1048  2032  3016  787CIP2_66  7625
65  1049  2033  3017  787CIP2_67  7630
66  1050  2034  3018  787CIP2_68  7638
67  1051  2035  3019  787CIP2_69  7640
68  1052  2036  3020  787CIP2_70  7670
69  1053  2037  3021  787CIP2_71  7676
70  1054  2038  3022  787CIP2_72  7688
71  1055  2039  3023  787CIP2_73  7690
72  1056  2040  3024  787CIP2_74  7700
73  1057  2041  3025  787CIP2_75  7774
74  1058  2042  3026  787CIP2_76  7784
75  1059  2043  3027  787CIP2_77  7785
76  1060  2044  3028  787CIP2_78  7792
77  1061  2045  3029  787CIP2_79  7798
78  1062  2046  3030  787CIP2_80  7807
79  1063  2047  3031  787CIP2_81  7810
80  1064  2048  3032  787CIP2_82  7812
81  1065  2049  3033  787CIP2_83  7816
82  1066  2050  3034  787CIP2_84  7826
83  1067  2051  3035  787CIP2_85  7842
84  1068  2052  3036  787CIP2_86  7850
85  1069  2053  3037  787CIP2_87  7865
86  1070  2054  3038  787CIP2_88  7882
87  1071  2055  3039  787CIP2_89  7891
88  1072  2056  3040  787CIP2_90  7892
89  1073  2057  3041  787CIP2_91  7896
90  1074  2058  3042  787CIP2_92  7896
91  1075  2059  3043  787CIP2_93  7907
92  1076  2060  3044  787CIP2_94  7913
93  1077  2061  3045  787CIP2_95  7914
94  1078  2062  3046  787CIP2_96  7915
95  1079  2063  3047  787CIP2_97  7920
96  1080  2064  3048  787CIP2_98  7921
97  1081  2065  3049  787CIP2_99  7924
98  1082  2066  3050  787CIP2_100  7927
99  1083  2067  3051  787CIP2_101  7929
100  1084  2068  3052  787CIP2_102  7937
101  1085  2069  3053  787CIP2_103  7940
102  1086  2070  3054  787CIP2_104  7942
103  1087  2071  3055  787CIP2_105  7944
104  1088  2072  3056  787CIP2_106  7951
105  1089  2073  3057  787CIP2_107  7951
106  1090  2074  3058  787CIP2_108  7962
107  1091  2075  3059  787CIP2_109  7964


108 


1092


2076


3060


787CIP2_110


7077 

ly 1 1 


109 


1093 


2077 


3061 


787CIP2_111


7978 


110 


1094 


2078 


3062 


787CIP2_112


7980 


111


1095 


2079 


3063 


787CIP2_113


7982 


112


1096 


2080 


3064 


787CIP2_114


8000 


113


1097 


2081 


3065 


787CIP2_115


8003 
115



116 
117 



118 



120 



121 



122 



1099 



1100 



1101 



1102 



1103 



1104 



1105 
1106 



2083 



2084 



3066 



3067 



2085 



2086 



2087 



2088 



2089 



1116 
1117 



3068 
3069 



3070 



3071 



3072 



3073 



787CIP2 116 



787CIP2 117 



787CIP2_118



787CIP2 119 



787CIP2 120 



8004 



8007 



8008 



8009 



787CIP2 121 



787CIP2 122 



3074 
3075 



787CIP2_123 
787CIP2 124 



787CIP2_125 



787CIP2_126 
787CIP2 127 



8013 



8017 



8018 



8021 




8091 
8100 



787CIP2 163 



787CIP2 164 



787CIP2 165 



787CIP2 166 



8124 
8125 



787CIP2 167 



787CIP2 168 



787CIP2_169 
787CIP2 170 



787CIP2 171 



787CIP2 172 



787CIP2 173 



3124 
3125 



787CIP2_174
787CIP2 175 



8141 
8147 



WO 01/57190



1 O/i 

1 /4 


1 1 CQ 
1 1 JO 


91/17 
Z14Z 


J IZu 


7o7CJLP2 176 


Q 1 >f Q 

5i4y 


i o< 
I /3 


1 1 CO 


7 1/1 "3 
Z14J 


7177 
J 1Z/ 


OOO/TDO 1 OO 

lo/^LrZ ill 


0 1 CA 

5130 


\ 9£ 
1 /o . 


1 1 £A 
1 1 OU 


7 1 AA 
Z l^Hf 


1 17R 
J 1Z5 


9R9^T"D9 1 0R 

/o/v-/JLrZ 1 /o 


R 1 co 
513/ 


1 97 

III 


1 1A1 
1 101 


71/1 <. 
Z 1H- 3 


1 170 

j izy 


909^107 1 90 

/o/UJLrZ 1 /y 


Rl £1 
5101 


1 9fl 
1 /o 


1 1 £7 

1 loz 


714A 
Z140 


J 1 Jv 


9R9/TD9 1 OA 

/o/Ultz loU 


0 1 £7 
5 10Z 


1 90 

1 Jy 


1 1 f/X 
1 103 


71/19 
Zl*f / 


mi 

JXJ 1 


9R9/TDO 1R1 
/O/^slrZ lol 


R1 

51 03 


1 RO 


1 1 £4 


71AR 
ZlHO 


1 1 17 
0 1 JZ 


7R7PFP7 1 R7 
/ o / ulrz l oz 


81 fkf* 
5 IOO 


151 


1 1 £*\ 
1 10 3 


71 AO 

ziny 




9R7/^TP7 1 Rl 
/O/V-'lrZ 153 


Rl fsl 
510 / 


1 R7 
15Z 


1 1££ 
1 100 


71 ^A 
Zl jU 


J 1 J4 


9C9^ft>9 1 0/1 
/o/ULrZ lo4 


01 /CO 

sioy 


1 Rl 
1 03 


1 1 fn 

l 1 o / 


71^1 

Z 13 1 


1 11^ 


7R9PrP7 1R^ 
/o/^lrZ 103 


R1 9 A 
51/0 


1 RA 
1 o** 


1 1 £R 
1 105 


71 C7 
Z13Z 


1 1 1A 
•5 1 JO 


9R9r*TP7 1 R< 
/o/\^lJrZ 150 


R 1 97 
51 /Z 


1 Rf. 


1 1 £Q 

i ioy 


Zl jj 


1119 
31 5 I 


9R9/^fP7 1 R9 
lol \*lr Z 15/ 


R 1 99 
51/3 


1 Rfi 


1 1 on 
1 1 /u 


71 ^/l 
Z134 


11 1R 
J 1 Jo 


9R9/TD7 1 RR 
/o/'wLrZ 155 


0 1 9/1 
51/4 


1 R7 


1171 

1 1 / 1 


71 ^ 
Z ID j 


1 1 10 

Jioy 


909^107 1 RO 
/S/'OlrZ 15y 


R 1 9/1 
51 /4 


1 RR 


1 1 77 

I i /z 


71 C£ 
Zl JO 


1 1 A(\ 
J 14U 


9R9^TD9 lOI 

to /^LrZ_iyi 


O 1 C9 

515Z 


1 RO 


1 1 97 
1 1 /5 


7 1 C9 
Zl 3 1 


11/11 
3141 


OOO/^TDO 1 OO 

/5/UlrZ iyz 


O 1 OA 

olo6 


i on 

iyu 


1 1 9/1 
11/4 


7 1 CO 
Z135 


11 /lO 

5 14Z 


ooor^roo i qi 
fo /UirZ_iy5 


0100 

5155 


1 01 


I 1 9C. 

I I / J 


9 1 co 
zi jy 


1 1 A1 

j14j 


OOO/^TDO 1 QA 

/o/UrZ iy4 


O 1 O 1 

oiyi 


1 09 

iyz 


1 1 9/C 


O 1 £A 

ZlOU 


o 144 


OOO/^TDO IOC 

/o/Clrz iy3 


O 1 AO 

8192 


i oi 

iy3 


1 1 99 


Zlol 


1 1 A< 

3143 


ooooTTio i a/t; 
7o7LUr2 196 


O 1 AO 

8193 


1 Q/l 
iy4 


1 1 9Q 

11/5 


O 1 AO 

ZioZ 


11 AtL 

3140 


O09/^TI>0 1 OO 

7o/ClrZ_ly/ 


O 1 A/f - 

oly4 


i oc 
iy3 


I 1 90 

I I /y 


O 1 AT 

Zloi 


1 1 /IO 

314/ 


OOO^TDO 1 AO 

7o7Clr2_19o 


O 1 AC 

8195 


1Q< 

iyo 


1 1 OA 

1 loU 


O 1 CLA 

Z164 


1 1 A O 

314o 


OOO/TDO 1 AA 

7o7CUr2_lyy 


O 1 C\£. 

8196 




1 1 O 1 

1 lol 


O 1 *CC 

Zloj 


3149 


TOO/^TTJO OAA 

7o7CLr2 200 


OOAA 

8200 


1 QO 


1 1 09 

1 loz 


2166 


*2 1 <A 

3150 


7o7Cirz 201 


OOA 1 

8201 


1 QQ 

\yy 


1 lo3 


O 1 £0 

2167 


3151 


787Clr2 202 


OOAO 

8202 


oaa 

ZUU 


1 1 QA 

1 184 


O 1 CO 

2168 


O 1 CO 

3152 


TOlPmi OAO 

7o7CJLr2 203 


OOAC 

8205 


OA 1 

2U1 


1 1 OC 

1 185 


O 1 Cf\ 

2169 


3Z53 


7o7CJLP2_204 


8206 


ZUZ 


1 1 0£ 

1 loo 


O 1 OA 

Z17U 


3154 


TO*7r , TTJ1 OAC 

787Clr2 205 


OOAO 

8207 


zito 


1 1 OO 

1187 


O 1 O 1 

2171 


3155 


787CUP2_206 


OOAO 

8208 


OA/I 

ZU4 


1 1 OO 

1 loo 


O 1 OO 

2172 


3156 


787CIP2 207 


OOAA 

8209 


203 


1 1 DO 

1 loy 


O 1 TO 

2173 


3157 


787C1P2_208 


OO 1 A 

8210 


OA/C 

ZOO 


1 1 QA 

liyu 


O 1 O/f 

Zl /4 


O 1 CO 

315o 


lOlPTD'l OAA 

7o7Clr2 209 


OO 1 1 

8211 


OAT 
ZU/ 


1 1 Ol 

1 191 


o i oc 

2175 


3159 


OOOOTDO O 1 A 

787C1P2 210 


8212 


OAP 
ZUO 


1 1 OO 

1 lyZ 


O 1 9/C 

zl /o 


0 1 >CA 

3 loo 


7o7Clr2_zll 


OO 1 o 

82 13 


OAQ 

zuy 


1 lya 


o i in 

2.Y11 


O 1 £1 

3161 


lOIPTTM OIO 

7o7Clr2 212 


8214 


O 1 A 

ZIU 


1 1 QA 

1 ly4 


O 1 OO 

zl Jo 


O 1 /TO 

3162 


OOO/TTIO OIO 

7o7Clr2 213 


OO 1 c 

8215 


Ol 1 

Zi i 


1 1 OC 

i iy3 


O 1 OA 

zl/y 


3163 


7o7Clr2_214 


OO 1 zT 

8216 


O 1 0 
Z1Z 


1 1 Q£ 

i iyo 


O 1 OA 
787CIP2C_149


^ AO 1 

6981 


916 


1900 


2884 


3868 


787CIP2C_150 


7000 


917 


1901 


2885 


3869 


787CIP2C 151 


OAO A 

7029 


918 


1902 


2886 


3870 


787CIP2C 152 


000c 

7885 


919 


1903 


2887 


3871 


787CIP2C 153 


8143 


920 


1904 


2888 


3872 


787CIP2C_154 


8143 


921 


1905 


2889 


3873 


787CIP2C_155 


OO T /t 

8234 


922 


1906 


2890 


3874 


787CIP2C_156 


O A Z"0 

8463 


923 


1907 


2891 * 


3875 


787CIP2C_157 


8467 


924 


1908 


2892 


3876 


787CIP2C_158 


8540 


925 


1909 


2893 


3877 


787CIP2C_159 


O C (\f\ 

8600 


926 


1910 


2894 


3878 


787CIP2C_160 


9656 


927 


1911 


2895 


3879 


787CIP2CM61 


9669 


928 


1912 


2896 


3880 


787CIP2C_162 


9695 


929 


1913 


2897 


3881 


787CD?2C_163 


9744 


930 


1914 


2898 


3882 


787CIP2C 164 


9849 


931 


1915 


2899 


3883 


787CIP2D_1 


A 1 OA 

41b0 


932 


1916 


2900 


3884 


787CIP2D_2 


4181 


933 


1917 


2901 


3885 


787CIP2D_3 


4314 


934 


1918 


2902 


3886 


787CIP2D_4 


4500 


935 


1919 


2903 


3887 


787C1P2D_5 


cc c 1 

5651 


936 


1920 


2904 


3888 


787CIP2D_6 


5691 


937 


1921 


2905 


3889 


787C1P2D_7 


coo 1 

5881 


938 


1922 


2906 


3890 


787CIP2D 8 


c 0 OO 

5882 


939 


1923 


2907 


3891 


787C1P2L>_9 


6209 


940 


1924 


2908 


3892 


787C1P2D_10 


6719 


941 


1925 


2909 


3893 


787C1P2D 11 


O 1 ^ A 

8130 


942 


1926 


2910 


3894 


O00/TDOT\ lO 

787C1P2D_12 


5603 


943 


1927 


2911 


n one 

3895 


/o7dJrzlJ_13 


QQAO 

oyOz 


944 


1928 


2912 


3896 


787C1P2D_14 


A1 /TO 

9162 


945 


1929 


2913 


3897 


787CIP2D__15 


A 1 AO 

9197 


946 


1930 


2914 


3898 


787CIP2D 16 


AO 1 C 

9215 


947 


1931 


2915 


3899 


787CIP2D 17 


9232 


V4o 






3yuU 


7S7PTP9D IP. 




949 


1933 


2917 


3901 


787CIP2DJ9 


9369 


950 


1934 


2918 


3902 


787CIP2D_20 


9371 


951 


1935 


2919 


3903 


787CIP2D 21 


9516 


952 


1936 


2920 


3904 


787CEP2D_22 


9601 


953 


1937 


2921 


3905 


787CIP2D 23 


9731 
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954 


1938 


2922 


3906 


787CIP2D 24 


9733 


955 


1939 


2923 


3907 


787CIP2D 25 


9769 


956 


1940 . 


2924 


3908 


787CIP2D 26 


9804 


957 


1941 


2925 


3909 


787CIP2D 27 


9816 


958 


1942 


2926 


3910 


787CIP2D 28 


9844 


959 


1943 


2927 


3911 


787CIP2D 29 


9924 


960 


1944 


2928 


3912 


787CIP2D 30 


9936 


961 


1945 


2929 


3913 


787CEP2D 31 


10163 


962 


1946 


2930 


3914 


787CIP2D 32 


10165 


963 


1947 


2931 


3915 


787CIP2D 33 


10165 


964 


1948 


2932 


3916 


787CIP2D 34 


10244 


965 


1949 


2933 


3917 


787CIP2D 35 


10278 


966 


1950 


2934 


3918 


787CIP2E_1 


4251 


967 


1951 


2935 


3919 


787CIP2E_2 


5310 


968 


1952 


2936 


3920 


787CIP2E_3 


5697 


969 


1953 


2937 


3921 


787CIP2E_4 


5731 


970 


1954 


2938 


3922 


787CIP2E_5 


5733 


971 


1955 


2939 


3923 


787CIP2E_6 


5734 


972 


1956 


2940 


3924 


787CIP2E_7 


5740 


973 


1957 


2941 


3925 


787CIP2E_8 


7657 


974 


1958 


2942 


3926 


787CIP2E_9 


9572 


975 


1959 


2943 


3927 


787CIP2F_1 


1363 


976 


1960 


2944 


3928 


787CIP2F_2 


4303 


977 


1961 


2945 


3929 


787CIP2F_3 


5760 


978 


1962 


2946 


3930 


787CIP2F_4 


5766 


979 


1963 


2947 


3931 


787CIP2F_5 


5767 


980 


1964 


2948 


3932 


787CIP2F_6 


5767 


981 


1965 


2949 


3933 


787CIP2F_7 


5770 


982 


1966 


2950 


3934 


787CIP2F_8 


6855 


983 


1967 


2951 


3935 


787CIP2F_9 


10026 


984 


1968 


2952 


3936 


787CIP2F_10 


10227 



TABLE 6 



SEQ ID NO: 


Method 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


2953 


A 


3 


324 


ISEHRDEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CGWTILLTLTMVQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPE) 


2954 


A 


18 


467 


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 
YRAFQVIQLSLPEDQKI^ITAATVGQSAVLSCAIQ 
GTLRPPnWKRNNIILNNLDLEDINDFGDDGSLYI^ 
KVTTTHVGNYTCYADGYEQVYQTfflFQVWPPV 
IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQHIWESDSNEFSV 
IADPRGNTLGRGTTIT*VSIPPSL 


2956 


A 


1 


493 


RTKTDVYELNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWnCFCVWMAAILLSIPQL 

VFYTVNDNARC]PIFPRYLGTSMKALIQMLEICIG 

FWPFLIMGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLNILKNFFLSPLDTRKNKWKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG 
ETRSLPACWAQWKSLALPVSRAPGRQGSLVVFP 
LP 


2958 


A 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 
NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFKRGTETRVREIIQHPSAKGNLCPPTN 
ETRKCTVQRKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSICEIPEQRENKQQQ 


2959 


A 


1 


426 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWLIFLRLVLCGFTLVLLVRIICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 


2960 


A 


1194 


852 


EKRKTSYSQCLNSKQR2STVSMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLPA^*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPPTSVRRMPLITTVTLLKMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHII 

SILMGQPMALVQLETLAPLTniQKFQTQDHMKF 

WKNLPLHSHm.TPSVPQTVIPKKTGSPEIKLKITK 

TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKKSNKHDSSRSEERKSHKEPKLEPEEQNRPN 

ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 

HWKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTKINTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 

QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 

PPVKIAWKTAAARKSLPASITMHKGSLDLQKCN 

MSPVVKIEQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLnSTPNQRI^EKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEVVPKKKIK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 


2962 


A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVLAQNVGTTHDLLDICLKRATV 

QRAQHVFQHA VPQEGKPITNQKSSGRC WIF SCLN 

VMRLPFMKKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLRmVHSGATKGErSATQDVM 

MEEIFRWCICLGNPPETrTWEYRDKDKNNKKIG 

P\ITPLErTSTR/EQrTVXPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPEDFLK 

KMVAASIKDG\EAVWFGCDVGKHF\NSKLG\LSD 

MNLYDHELWGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKW\RVGEFQWG 

EDHGH\KGYLCMTD*VGSLEYVYEVVA r WDRKH 

VP\EEVLA\TLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKITLKIAKNYLEQRAVGGASPRLAQS 
VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 
DRMKTTIKETST*LSNS YL VFPLM* SLTYLMKMS 
FERCTAIO^KMFVNSPFTKVDNYCTVSS\WKKFYL 
KCYFSLNTIKKEKKMT 


2964 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEQNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

WKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKD VP/IAC A S A * GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAEPTS 

QPPSATPG*PRRHLKEQNLS\VKVFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSEHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

J\yj w o Jj^iv^ V oortiKlJIN Ja 1 r IN aOL/ovjv^CrlJoKbM F 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVF1FHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL \ 

KRGAIYGSSW 


2965 


A 


3 


2454 


FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKECQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDW/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AGW^n^OV^^PPi?n>JPTTrNjQr;nQr;nr^riQP ca/tt 
vv oL/ooy v oox nivLJiNr!/ 1 riNoOi^ovjv^Oj^oivolVl l 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPWGTYVFIFHMLKLAV1WPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 


2966 


A 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LG1AYVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLIEFLQ 

SL*NSL\*AVTSYEDLGLFAFGLPGKLWAGTTIIQ 

NIGAMSSYLLIIKTELPAAIAEFLTGDYSRYWYLD 

GQTLLinCVGIVFPLALLPKIGFLGYTSSLSFFFM 

MFFALVVIIKKWSIPCPLTLNYVEKGFQISNVTDD 

CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 

LQSPSKKRMQNVTNTA1ALSFLIYFISALFGYLTF 

YD/GTTKAQRGEVTCHRIKDKVESELLKG* * *EP* 

SHDWVMT\VKLCILFAVLL\TVPLIHFPARKAVT 

MMFFSNFPFSWIRHFLITLALNUIVLLAIYVPDIRN 

VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSILRNSLSVYEDLPASRKSIYFK 

I 


2967 


A 


3 


3222 


SGIVVRALWREKKPGGGRRVKRJRl^GRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVICRNLEKYGLNELP 

AEEGKTLWELVDEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENA1EALKEYEPEMGKVYRADRKSVQRIKARD 

IVPGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTEIGKrRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTD VRSLSK VERANACNS VIRQLMKKEFT 

LEFSRDRXSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCNYVRVGTTRVPLTGP VKEKIMA VIKE 

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEVVCIFLTAALGLPEA 

LIPVQLLWYNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLRSGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRMPPWVNIWLLGSICLSMSLHFLELYVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEELKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414 


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

IQELEELGVGIGWHAGYERRLAHHLGAHSTPSI 

LGHNGKISFFHNAVVRENLRQFVESLLPGNLVEK 

VTbfBOSTYVRFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 

NIYAPTLLWKEHINRPADVIQARGMKKQIIDDFI 

TR^YLIAARLTSQKLFHELCPVKR^HRQRKYC 

VVLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 

AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 

CWDSIFHNNW\REMMPLLSLIFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTXENS 

oisjjrjsjvvjrr v JD v l EL, lUv I i 1 oxNL, V ivL/Jtvr LrrlMIN V 

VLILSNSTKTSLLQKFALEVYTFTGSSCLHFSFLSL 
DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 
TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 
SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 
WMERLLEGSLQRFYTPSWPELD 


2969 


A 


48 


1117 


KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQIIWLFERPHTMPKYLLGSVNKSVVPD/YGI 

P/YTSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 

NGTLSASQKIQVTVDDPVTKPVVQIHPPSGAVEY 

V LriNlVi l JL# 1 Url ViiO<ali\JLA r ^WLJsJNLrKr Vrll oo 1 

YSFSPQNNTLfflAPVTKEDIGNYSCLVRNPVSEM 

ESDIIMPEYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSWIRRTDNTTYriKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVIITSVGMCDIQGRDPNKT 


2970 


A 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 
QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 
FLTLLCLLLLIGLGVLASMFHVTLKIEMKKMNKL 

AIXLCRELYSKEQEHKCKPCPRRWIWHKDSCYF 
LSDDVQTWQESKMACAAQNASLLKINNKNALE 
FIKSQSRSYDYWLGLSPEEDS/YSWYESG*YNQ\P 
SAWVIRNAPDLNNMYCGYINRLYVQYYHCTYK 
QRMICEKMANPVQLGSTYFREA 


2971 


A 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 

LVAFAYWNHYLSCTSPCSCYRPLCRLNFGLNVV 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 

IVFIASSLGHMLLTCmWRLTKKHTVSQE\DGLSL 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRWGQNLGL*HCVWVVWE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

niN Okjr\xx ivi i ivjivj.r r or l vr vjvjo v vjtvj L»ri v lr JJLjix 

PEVEAAGIPLLLGPSLPQRQGREHIWILAAPACA 
PFHDR*WEPREERPSP*ELGLRGEPTLSYPASCRVT 
RQPIP*DRKSYSWKQRLFIINFISFFSALAVYFRHN 
MYCEAG WTIFAILE YTVVLTNMAFHMTA WWD 
FGNKELLITSQPEEKRF 


2972 


A 


1734 


246 


GGDLSGRDGRTALPRPREPAERTAGLRRDMRPQE 

LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 

LDAROLPAWFDOAKFGIFIHWGVFSVPSFGSEWF 

WWYWQBCEKIPKYVEFMKDNYPPSFKYEDFGPL 

FTAKTFNANQ\WADIFQASGAKYIVLTSKHHEGF 

TLWG\SEYSWNWNAIDEGPKRDIVKELEVAIRNR 

TDLRFGLYYSLFEWFHPLFLEDESSSFHKRQFPVS 

KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 

STGFLAWLYNESPVRGTWTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDKLSWGY 

RREAGISDYLTffiELVKQLVETVSCGGNLLMNIG 

PTLDGT1SWFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTfflQMPCKWGWALALTNVI 


2973 


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQWKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRWVPKrTVVKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKXTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 


1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYCDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCLGVNHIHKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARLVQKCLPPEIIMEYGEEVLE 

EIKNSKHNTPRKKTNPSRIRIALGNEASTVQEEEQ 

DRKGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSPNLHRRQWEKNVPNTALTALENASILT 

SSLTAEDDRGGSVIKYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DKPDWVSELKKRAGWQGLCDR 


2975 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNWTGELAAIKVEKLEPGEDFAWQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAJELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQUPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQE3VIHSTEDENQGTIKRCPMSGSP 

VAKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMhO^CLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 

WCQKCCVVROTYTGHKYLCGALQTSIVLLEWV 

jbrMv^Kr MLJUKJHroFPIPCPLKMFEMLVVPEQEYP 

LVCVGVSRGKDFNQVVI^ETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYELAGHENSY 


2976 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 
QPVNPVFDLSRRKPQEDFELIQRIGSGTYGDVYK 
ARNVNTGELAAIKVIKLEPGEDFAWQQEIIMMK 
D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS 
\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 
KMHRDIKGANELLTDNGHVKLADFGVSAQITATI 
AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 
WAVGITAffiLAELQPPMFDLHPMRALFLMTKSNF 
QPPKLKDKMKWSNSFHHFVKMALTC 
AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 
HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 
FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 
TAR5NLDLQLEYGQGHQG\GYFLGANKSLLKSV 
EEELHQRGHVAHLEDDEGDDDESKHSTLKAKEP 
PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 
\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 
ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 
LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 
TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 
CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 
RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 
WCQKCCVVRNPYTGHKYLCGALQTSTVLLEWV 

LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES 

DTPQTNVTHYTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYELAGHENSY 


2977 


A 


174 


1543 



YSLRKGITFKLAGAMVHIKKGELTQEEKELLEVI 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 

LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 

AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 

GLDKEPKLPPKLAGPLHOITTTNLHPVKIVMLV 

NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 

NE\O.AMKMHYISCIFQKCI^KDGENKLDTLK 

iLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

QQLVRSIAPVEIGSDPTAFSVLTQAITGQVGFVDV 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 

FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 

LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 

KESLESEAELEGLODAPAGPOVSEE 


2978 


A 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRJTPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 
VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

WKPFSEFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQmCGRQIICSYL 

SQSIELKWQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVIQVPSSNSSHYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPIIIHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKIIIPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEWTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK 

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVEHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSDfflELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQWFLTGFGYVYVDW 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRL'ELDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEEYKEKCFIKLCITLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYDCTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSMASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSrTVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEVL 

FWSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPfflG 

NYRLLKTIGKGNFAKVKLARHILTGKEVAVKIID 

KTQLNSSSLQKLFREVRIMKVLNHPNIVKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQIVSAVQYCHQKFIVHRDLKAENLLLDA 

DMNIKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

NLKELRERVLRGKYRIPFYMSTDCENLLKKFLIL 

NPSKRGTLEQ1MKDRWMNVGHE\DDELKPYGEP 
LP\DYKDPRRTELMVSMGYTREEIQDSLVGQRYN 
EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 
SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 
YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 
SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLL\E 
PvASL\GQGFHPEWAKTALTMPGSRASTASASAA 
VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 
VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 
PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 
VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 
PESKDR\VETLRPHW\NSGGNDKEKEEFREAKPR 
SLRFT WSMKTTS SMEPNEMMREIRK VLD ANSCQ 
SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 
RLSLNGVRFKRISGTSMAFKNIASKIANET.KT 



2981 A 



3433 



120 



3433 



MCLLLQAKUI-HGEIEDLQQWLI'D'I'ERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
QKGQQMLARCPKSAETN1DQDINNLKEKWESVE 
TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 
SKLMEWLEESEKSLDSELEIANDPDKIKTQLAOH 
KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 
NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 
LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 
DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 
RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 
KQTRLEAALRQAEEFHSVVHALLEWLAEAEOTL 
RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 
LNKATTMGDTVLA1CHPDSITTIKHWITIIRARFEE 
VLAWAKQHQQRLASALAGLIAKQELLEALLAW 
LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 
EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 
LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 
LVSKWQQVWLLALERRRKLNDALDRLEELREF 
ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 
DQDGKITRQEFIDGILSSKFPTSRLEMSAVADrFD 
RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 
DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNO 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 
VKNDPCRAKGRTNMELREKFILADGASQGMAA 
FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPO 
WATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 
DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 
GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 
PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 



NCL^AKUi-HUElEDLQQ WLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 
TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 
QSRWEKWQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPD^ITTIKHWITIIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFEDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A 


1 


2065 


MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSIDLNKPI 

DKRIYKGTQPTCHDFNQFTAATETISLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 

KQVAWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYWTGGEDDLVTVWS 

FTEGRVVARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATLTLQERRDRGAEKEHKRYHSLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PLLEPLVCKKIAQERLTVLLFLEDCIITACQEGLIC 

TWARPGKAFTDEETEAQTGEGSWPRSPSKSVVE 

GISSQPGNSPSGTW 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 

LQELNANLSNLTSAFEKATAEKIKCQQEADATN 

RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

CGD VLLIS AF V S YVG YFTKKYRNELMEKFWIP YI 

Hh^KWIPITNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATELGNTERWPLIVDAQLQGIKWIKN 

KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPIXGRIsTTIKKGKYIKIGDKEVGVPP 

PCT/US01/04098 





2984 



1464 



1890 



178 



QVPPDn'HQVLQPTLQARDAGSVHVLINFLVTRD 
GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 
IVLKELEDSLLARLSAASGNFLGDTALVENLETT 
KHTASEIEEKVVEAKITEVKINEARENYRPAAER 
ASLLYFILNDLNKINPyYQFSLKAFNVVFEKAIQR 
TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 
KLBFLAQVTFQVLSMKKELNPVELDFLLRFPFKA 
GWSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 
EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 
LCMVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 
VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 
LGFTIDNGKLHNVSLGQGQEWAENALDVAAEK 
GHWVILQNMLVARWLGTLDKKLERYSTGRHED 
YRVFIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 
YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 
CYFHAWAERRKFGAQGWNRSYPFNNGDLTISI 
NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 
DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFOIPP 
NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 
TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 
KAVLDDILEKIPETFNMAEIMAKAAEKTPYVW 
AFQECERMNILTNEMRRSLKELNLGLKGELTITT 
DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 
YANLLLRIRELEAWTTDFALPTTVWLAGFFNPQS 
FLTAIMQSMARKNEWPLDKMCLSVEVTKKNRE 
DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 
RLKELTPAMPVIFIKAIPVARMETKNIYECPVYKT 
RIRGPTYVWTFNLKTKEKAAKWILAAVALLLOV 



FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAIKFGSALGJCM 

SREPPPPYVTPATFETPEVHAGTGWGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVETWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQWHKNTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFK1LEPGRRERLGLKMANE 

AAAKMRAKKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 
DNLLQLPARRKASDFF 



ASl^EAULLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSOP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTPRETGKLKGEATVSFDDPPSAKA Am 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

M^TIIFVQGLGEm^TIESVADYFKQIGnKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 


2987 


A 


1376 


898 


GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 
WAGARQHGRNWRKRETSPGTQGPLPPVPRA^PP 
GPDG\PHAIAPTLSWAIPRQQCSPQPGRLNALPPD 
RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 
CPPQEPSLRSSRNRLREGQTFGRMEI 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAEDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIDLLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 

LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 


2989 


A 


27 


4074 


KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTEPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSERH 

SPLSSGISTPVTNVSPMHLQHIREQMA1ALKRLKE 

LEEQVRTEPVLQVK1SVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYEDYEEEEMETVEQSTQRDCEFRQL\TADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEIELQQQTIESLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP
SEQ ID
NO:



Method 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



69 



2993 



1687 



1159 



PCT/US01/04098 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



v VHA w Q^^m VGSHMDLVDTCVGTT 
^TOSVGISCQPECKNKWGPELPMNwiJS 

^mCGPPQLTVGLTASRRSVGVGD^VGEsL 

asss ssssssS 

lntlqhewfrvssqksaipamvgdyiaafeS 
dvlryvinladgngntalhysvshsot™.li 
dadvcnvdhqi^gytpimlaaTa^eSm 
rtveelfgcgdvnakasqagqtal^.avshg^i^ 

miAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 



ldqi^qvtftkrktgiWLyelsvS?IS 

T^ILETLKRRGIGLDGPELFJDEGPEEPGEOR 
PPPG\CDPSGLGEALPAQSRPSPFRPAAPKAGPPG 

lgotlfspshltsktpppIylpteg^dl^ggS 

GPRCKJLNTSRSLYSGLQNPCSTATP^PLGSFm 

pggppvgaeawapjivpqpaapSpos?S 

L^RPPGAPATFLRPSPIPCSSPGP^sSGfS 
CAGCPWPTAGPGRRSPGGTSPERSPGTSSnP 
\TSLQAFSEKTHTVTAPLRGGGLEV^JGWTOSSAf^ 
^^^Sl^ARGVRGPElS Q ° 

^LHCASPKEEMSLPXGDAl ^lLGPRVFGRYF 




=ssssS 
ssssss
SEQ ID
NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKHLILFLGDGLGVPTVTATRILKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVXANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVTTTRVQHASPAGTYAHTV 

NIWWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGERLDG 

KNLVQEWLAKJIQGAWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEMRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYTLRGSSIFGLAPSKAQDSKAYTSILYGN 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 


2993 


A 


3 


685 


DAWARLLKMNRLFGKAKPKAPPPSLTDCIGTVD 

SRAESroKKISRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQKRMYEQQRDNLA\NSHSTW\ 

TSVrT^TIQSLKDTKTTVDAMKLGVKEMKKAYKQ 

VKIDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFnKTLILPKSWGAFPEDVVMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

YIQVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSIIQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

MAQMRKQCLDYHHQEMQALKEVFKEYLIELFF 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEE\HFEVTNDEVKVVARKHGQPGTPVAIA-n 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 


2995 


A 


3


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 

APATTSSWEVVRNPLIASSFSLVKLVLRRQLKNK 

CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAAT\QALPQDSGTAAWKG\RV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAW 

EPMLWNPSGTPKRYSLELGKAIKQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPBCLK 

SKK 




A 


J 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQKRCQEKHNKLLSRTTrXNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKHNLDIJ1IHNKSNAAKN 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKIFTQRSHFFAPQKIHT
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


4 
e» 


2997


f- 

A 






SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHr 


\ 
E 
5 


2998


A 


3 


1763 


FI^GGPTGIG/VTFGYFNSDRLGRRVV^SSS 
MFLFGIAAAFAVDyYTFMAAPJFL^Arvr v 

VGFWVMEFIGMKSRT^S^VCT^V 
ALTGYLVRTWWLYQMILSWTVPFILCC^PF 

^^lsegryeeaqkwdimaSSSI 

ALACGVV^IVIPQKHYILGWTAMVG]^!pK3AA 
APFSVDLSSIWfflPQLFVGTMALLSWL^LPE 




"2999 J. 


l 3 




1441 

I 

I 


^gdsmhsdifparhysnwetWeSgdS 
isqrpeeseplekkgkyegpkakplnSSwS 
;faidsvlglentedslvytw^S? s S Ssi 




* 


20 ' 2 


417 "1 
E 

A 

PJ 
N, 
V] 
Dl 

Ql 


•RRRJ^llP^LLOULFLLSLLFLVOGAHGRGffR- 

A-rVWKLQPTAGLQDLHIHSRQEE EQSEIMEYS 
jagjWTLQCV?i^ms, wr ,Bw4"5-
SEQ ID
NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 

CRRKPRDYmVHMNLLLAVFLLDTSFLLSEPVA 

LTGSEAGCRASA1FLHFSLLTCLSWMGLEGYKLY 

RLVVEWGTYWGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPIILAVHRTPEGVIYPSMCWIRDSL 

VSYITNLGLFSLWLFmiAMLATMVVQILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LVVHLYLFSnTSFQGFLIFIWYWSMRLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGWFSVTTVD 

LKRKPADLQNLAPGTHPPnTFNSEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YIKNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIBCFSTRKFLDGNEMTLADCNLLPKL 

HIVKWAKKYRNFDIPKEMTGIWRYLTNAYSRD 

EFTNTCPSDKEVEIXAYSDVAKRLHQVKSRLLKE 

VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTXCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 


3002 


A 


909 


2799 


VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY

PESVKSFNHFTSLGHQKIMKRGKKSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFkRNSS 

LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLILHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFffiSAYLIRHQRIH 

TGEBCPYGCNQCQKLFRNIAGLIRHQRTHTGEKPY 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489 


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

EVKIRDWYQRQRPSEIKDYSPYFKT1EDLRNKIIA 

ATEENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 

VDLSCILNEMRNQYEQMAEKKRRDAETWFLSKT
Predicted end
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



940 



2552 



1



>3



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



SSRQTgPn.KEQSSSSFSQGQSS 



GCAPDTRF^ V^PGUROAAPWVALVARGGCTF K 
DKVLVAARRNASAVVLYNEERYGNTTLPMSHAG 
TGMVVIMISYPKGREILELVQKGIPVTMnGVGT^ 
^QEFISGQSVVFVAIAFIT^ S LA™S) 
RFLYTGSQIGSQSHMCETKKVIGQLLLHrS? 

KGIDVDAENCAVCIENFKVKDnRJLPCKMFKUC 
mPWLLDHRTCPMCKLDVIKALGYWGEPGDWE 
^APESPPGRDPAANLSLALPDDDGSDESSPPSA 

sp^sepqcdpsfkgdagentaxleaS^ 



TMTIH< 



tg^yqyvgklhsdqdkgdgslkyilsgdgTgt 
lfitoektgdihatpjiidreekafytlraoa^Srp 
tlrpvepesefvkihdindneptfpeeotasv^ 
mswgtswqvtatoaddpsygnsarvSo 

??^ S V™IIRTALPNM^^S^S 
NTmRVLESSPVGTAIGSVKATDADTGKNAEVE 

y^gdgtdmfdivtekdtqegiitvkSldyS 

RRLYTLKVEAENlTmjPRFY^GPF^S 

EDVDEPPVFSRSSYLFEVHEDIEVGnrawSn 

PDSISSPIRFSLDRHTDLDRIFMHSGNGSLYTreW^ 

LDRELSQWHI^WIAAEINNPKE^RVAWVIUL 

DANDNAPQFAVFYDIWCENARPGQLr Q TOA^D 
^DPLCKJQKFFFSLAAVNPI^QD^gN^ 

TOKNGFNlUffilSTYLLPVVISDNDYPIQSSTGTS 
RVCACDSQGl^QSCSAEALLLPAGLSTGlL^r? 
LCmLLVIVVLFAALKRQRKKEPLILSraDn^W 

iipeilfiprrtptapdntovrdfinerlkeSldp 

TA^SLATYAYEGNDSIAESLSSLESGTTCGD 



GRVDKTWWGKSVGmm EJOVLNSimVYHK Y 
SLKGNFHAVYRDDLKKLLETECPOYffiKKGAn 
VWFKELDINroGAVNFQEFLILvSS^^n 
DVYHKYSLKGNFHAVYM)DLQHXETE^POYl 



gqgrfgnqai>hflgslafak2Savp^ 

YQHHKPPFTmHVSYQKYFiaEPLQAY^vS 

dfmeklapthwppekrvaycfev^qrIpdSt 

ctmkegnpfgpfwqfhvsfnkseTftgiS 

ymqwsqrfspi^vlalpgapaqfpvleeiSp 

LQKYMVWSDEMVKTGEAQIHAli^YV?^ 

rjgsdwknacamlkdgtaWmastocvgyI 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAO 
SVWA7DSESYVPELOOT.F^v^r?l SL ^ Q
SEQ ID

NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008 


A 


3136 


1898 


TARGGGSEPGPlMAANYSSTSrRREHVKVKTSS 

QPGFLERLSETSGGMFVGLMAFLLSFYLIFTNEG 

RALKTATSLAEGLSLVVSPDSIHSVAPENEGRLV 

miGALRTSKLLSDPNYGVHLPAVKLRKHVEMY 

QWETEESREYTEDGQVKKETRYSYNTEWRSEH 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLBDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HVVTVIARQRGDQLVPFSTKSGDTLLLLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTLVT)WFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


DAAVAMTAQGGLVANRGRRFKWADELSGPGGG 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DRILVEKRCWD1ALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALMAISATFKMLESSS

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010 


A 


2 


1041 


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VVPECTMASSNTX^MRLVASAYSIAQKAGMIVR 

RV1AEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLVVWVDPLDGTKEYTEGLL 

DNVTVLIGIAYEGKAIAGVINQPYYNYEAGPDAV 

LGRTIWGVLGLGAFGFQLKEVPAGKHnTTTRSH 

SNKLVTDCVAAMNPDAVLRVGGAGNKIIQLIEG 

KASAYVFASPGCKKWDTCAPEVILHAVGGKLTD 

IHGNVLQYHKDVKHMNSAGVLATLRNYDYYAS 

RVPESIKNALVP 


3011 


A 


291 


1452 


SPQKTMRSHTITMTTTSVSSWPYSSHRMRFITNH 

SDQPPQNFSATPNVTTCPMDEKLLSTVLTTSYSVI 

FIVGLVGMIALYVFLGIHRBCRNSIQrYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKVVGTLFY 

MhMYISIILLGFISLDRYIKINRSIQQRKAITTK 

YVCCIVWMLALGGFLTMIILTLKKGGHNSTMCF 

HYRDKHNAKGEAIFNFILVVMFWLIFLLIILSYIKI 

GKNLLRISKRRSKFPNSGKYATTARNSFIVLIIFTI 

CFWYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSCLDPVMYFLMSSNIRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPITPDMQVQENFMSRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

WHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

QLRETLLQDFRVVAQGVGIPEDSIFTMADRGECV 

PGEQEPEPILlPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


3014


A


1 






3015


A




373 




3016


A


2 

O I—: 


1321 




3017 


A 


1321 

2P : , 


Qcqwrqppgkeiyrksnisv^vdgSSS^, 

K^PITGGWGAAVCRGRWGSVST^^^^Q^^ 


3018 


A 2 


640 


1 /U4 
1 


^AaAIWDTAGQERFRlLIPSYYRGAOGvrr 
^^SS^ LWESENQNKG 


3019 


A 1 


307 


2861 p 

B 




3020 


A 12 


02 


■ p 
' FJ 

Q 
Gi 
R( 
EJ 


^^^PRG^ DFGWD p CFQpDG ^ Q ^ 








180 V5 
L\ 
Ml 

QI 


^AL^LIFIMTXPFRMF^SS^SSJ 1 
[ ^TVFWSIALm^^
SEQ ID
NO:


Method 

•* 


Predicted

beginning

nucleotide

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AKELKOTCKAVLACVGVWIMTLTTTTPLLLLYK 

DPDKDSTPATCLKISDIIYLKAVNVLNLTRLTFFF 

LIPIJ^IMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRl 

HTLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNLSTCLDVILYYIVSKQFQARVISVM 

LYRIsTYLRSMRRKSFRSGSLRSLSNINSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAJEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
RWLfflYDNQGIELHCIRRCDRVTRLEFLPFHFLLA
TASETGFLTYLDVSVGKIVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 


3022 


A 


1 


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLYQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQKJvIFECSECEESFS 

KKCHLILHKIfflTGERPYECSDREKAFffiKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYLLLKRSGREITWKDFVNNYLSKGVVDRL 

EWNKRFVRVTFTPGKTPVDGQYVWFNIGSVDT
SEQ ID
NO:



Method 



Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 




Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



3024 FX 



274 



3026 rx 



621 



1455 



306 



1533 



454 



3027 



3028 



179 



703 



876 



1226 



ra^LblLQQKLGIEGENRVPVVYIAESDGSFlXs 

^PmilAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

WLKNPKQYQDLGAKIPKGAELTGPPGTGKIXLA 

KATAGEANVPFITVSGSEFLE3VIFVGVGPARVRDL 

^^™^ C ^ FmEIDAVGRKR GRGNFGGOSE 

QENTLNQLLVEMDGFNTTTNWILAGimPDELD 

PALLRPGRFDRQIFIGPPDIKGRASIFKVHLRPLKL 

?fJ^^ A ^ ASL ^FSGADVANVCmAA 

^S^^^QYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAQDDLRKVTQSAYAOIVO 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRJLIHDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 



LRACSLPSMSALEKSMHLGKLPSRPPLPGSGGSO 
SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV
ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 
WAVFTAAnSRVQCRIVALDLRSHGETKVKNPED 

2^3 TASSNLVPSLLGLCMIDVVE GTAMDAL 
NSMQNFLRGRPKTFKSLENAIEWSVKSGOIRNLE 
SARVSMVGQVKQCEGITSPEGSKSIVEGIIEEEEE 
DEEGSESISKRKKEDDMETKKDHPYTWRIELAKT 
EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD
KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 
AEAVATFLIRHRFAEPIGGFQCVFPGC 



^^ KAGGSFRSVQCiWtiU ^ L ^ PF RTSKSL 

SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 

HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 



AKVPyS IRbtKRENGLEARSPAINLMGFNVEEM" 
YEAHAWIQRILSLQNHHIIENNHILYLGRKEHDIL 
SQLQKTSSVSITOnSPGRTELEIEGARADLIEWM 
NIEDMLCKVQEEMARKKERGLWRSLGOWnoO 
QKTQDEMKEiraFLKCPVPPTQELLDQKKOFEKC 
GLQVLKVEKTONEVLMAAFQRKKKMMEEKLHR 
^™ FQQVPYQFCNVVCRVGF QKMVSTPCD 
PKYGAGIYFTKNLKNLAEKAKKISAADKLIYVFE 
AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSWD 
NVSSPETFVIFSGMQAIPQYLWTCTQEYVOSODY 
SSGPMRPFAQHPWRGFASGSPVD 



PFHLoASSNil-KLQVQ'rQESKAQKEVKMGFFSK 
SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGITFGLAAISL^AGAI 

™aflvpivplsfiltyqydlgygt£Eer^ 

GEAEDILETEKSKLQLPRGMTTFESIEKARKEQSR 
FFIDK 



DNKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 

^i^ GWDKLSPA QRPSLGFARRlRGRSCRER 
TWMLPSLVSEFLHRD
SEQ ID
NO:


Method 


Predicted 

Beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


3029 


A 


3 


1731 


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLILISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030 


A 


1 


584 


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEIFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 

CWSSCGQHPVQATHRGAVSNSLMLCILKLASQM 

PLENTTVQQMVFMLLSNLALSHDCKGVIQKSNF 

LQNFLSLALPKGGNKHLSNLTILWLKLLLNISSGE 

DGQQMELRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FHWCFSPANKPKILANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS 


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDEP 

LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST

QKKSSSSSLLRNENGDDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDIITDEQTTVEQQSVF 

TAPTGISQPVGKVFVEKSRRFQAADRSELDCTTEN 

IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 

FLAGCAVWNIWIYVLAGDQLSNLSNLLQQYKT 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVAERNF 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 

SSVNGSLWEAGffiEQILQPWIVVNLVVALLVGLS 

WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEDCA 

SS 


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNrTRPNEKGEYEVAEGIGSTVFRAE.DYYKTGU 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

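HELSNDGARRQFEFYLEEMILPLMVASAQSGERE
3034



Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence



Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)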



1972 



3035 



1172 



CHI V VLTUDDWDWDEEY^iGEEYSOnYSTr 

™KVKKRPGGRPEVIYNYVQRPFIRMS^iS 
EGKSRHVDFQCVKSKSITNLAAAA^JOT^D^^ 
g^ffTPQVDELDILPIHPPSGNSDLDPDAONP^ 



DESDVPAEIQVLKEPLQQPTTPFAVANOLLLVSL 

«y^™ LAGLQHPNIVGY HTAWIEHVHVIOP 
RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

ssiifaeptpekekrfgesdtenqSks™™ 

VIRESGELESTLELQENGLAGLSASsSJS 
RNSHLEESFTSTEESSEENVNFLG^raAO^S; 

y^NVAraFQELVEGVFYIHmiGjWL^R 

^lhgpdqqvkigdfglacidilqkntdW 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

myslgwllelfqpfgtemerAe^tglrtcol 



p^qsaceefwarlrklnwkrilirhredipfdgto 



' FfF^KKAAAAESD^AklWAGRSMQAARCPTO 
ELSLTOCAVVNEKDFQSGQHVIVRTSP^YTFT 

TTDKAKQCIGTMTIEroFLQKKSIDShffTOTDKM 
AAEFIQQFmfQAFSVG^LVFSFNEKXFGL™ 
IEAl^PSILNGEPATGI^KIEVGLWGNSO\S 

IGGLDKEFSDIFRRAFASRVFPPEr^OMGcSrJ? 

PEILNKYVGESEANIRJCLFADAEEEORRLGANSG 
LHinFDEIDAICKQRGSMAGSTGVTOnSS 
KlDGVEQLNNlLVIGMimPDLmE^L^PXEV 
^LPDEKGRLQILHmTARMRGHqSv
SEQ ID
NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










LVLLKKAPPQGRKLLnGTTSIUGDVLQEMEMLNA 
FSTTIHVPNIATGEQLLEALELLGNFKDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 


3037 


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL 

GEGGTRDSDRARRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

PEDTCARLRCSLHASLLVVTLNPDQCHPSRKRRA 

AIPWKLSCKNLCHRHQLFINFRJDLGWHKWIIAP 

KGFMANYCHGECPFSLTISLNSSNYAFMQALMH 

AVDPEIPQAVCIPTKLSPISMLYQDNNDNVILRHY 

EDMWDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLWSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSELSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 


3041 


A 


1015 


175 


GLKRRHLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARI^FLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKIsfNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD
SEQ ID
NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


3043 








KGLSGKRTKTENSGEALAKVEDSMPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTOEEF 
DFVLSLEEKEPS 


3044 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGW1GGA 

ASVIVGHPLDTVKTRLQAGVGYGNTLSCIRWY 

RRESMFGFFKGMSFPLASIAVYNSWFGVFSNTO 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGWSV 

GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEWITPEACTGPSPCAV 

WLAGGMAGAISWGTATPMDWKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3045 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRRNKKRGIFPKVATNIMRAWLFOHL 

SHPYPSEEQKKQLAQDTGLTILQVNNWFINARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH

VAFRAPASVGMSLNSEGEWHYL 


3046 


A 


3 


967 


VAHrQWHTCQRLSQLTHRSJLKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIO 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFlsJRLKK 
KMQPPAAAVTLHLGAHGF 




3047 


A 


1185 


1584 


MYAYMYICTH1CICAYRGIHIDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTHICMCIHTYVYVYTYMY 
VYTYICLCVYICLCVHIYLCVYIHMYMCTHICMC 
[HTYVHMCICVYIHMYTCVYVYTYTCVYMY 


2 




A 


811 


132 

] 
( 
I 
I 
( 
I 


bLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

^TDAHLDINFKEGLKKERSYTGQFEANVRDEER 

3CGCGWPDSLLMKVLSQRLDQQDCIQKGWVL 

iGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

4ERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 
JNPKDAEEOVK7 KMT>T FVBMC a m cm v/->o a n. 

.NGDQDPYTVFEYIESGIINPLPKXIP 




048 / 


I 2 


- J 


166 J 
P 

L 


LPRRGQGLVQEVQTENVTVAEGGVAEITCRLHO
TDGSrVVIQNPARQTLFFNGTRALKDERFQLEEFS 
RRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
TVLVAPENPWEVREQAVEGGEVELSCLVPRSR
SEQ ID
NO:


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










PAATLRWYRDRKELKGVSSSQENGKVWSVAST 

VRFRVDRKDDGGUICEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN 

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLWYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

nCVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEWQAFRAAVATTRGDQESAE 

ANKFQVTDSAATOALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDIKAYLGSAIQL 

VSCLSETTVLAAVLRfflSVLVPCFLTFPKQCRML 

LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVnGCIKLIPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK

PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPVVLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERLEDLNFPEIKRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 


3050 


A 


870 


182 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGGGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRNL\TYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEYIACDENPDATY 

DFKSDLWSLGITAJEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRIQLKDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

EEHKRQLLAERQKRIEEQKEQRRRLEEQQRREKE 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS
SEQ ID

NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP

PMLRPVDPQEPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRIEKFDRSSWLRQEEDIPPKVPQRTtSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPBLES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEARKISVVNVNPTNI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLRVYYLSWLRNRILHhJDPEV 

EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 

NAVEIYAWAPKPYHKFMAFKSFADLQHXPLLVD 

LTVEEGQRLKVIFGSHTGFHVIDVDSGNSYDIYIP 

SHIQGNITPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW 

GEKAmmSVETGHLDGVFMHKRAQRLKFLCERN 

DKVFFASVRSGGSSQVFFMTLNRNSMMNW 


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 

KXEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQIAHALKLSEVQVKIWFQNRRAKWKRKAGN 

VSSR^EPVRNPKIVVPIPVHVNRFAVRSQHQQM 


3053 


A 


203 


2167 


FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTWAAVQAIERK 

VEBHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

NFWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEEPTDPSEEPGISTS 

DDLSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KlbLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 
GSGGGVL 


3054 




3 


2212 



SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLN1LHLT 
^ESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF 










APWARASFLCHAFQRPLTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGWDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGKFFSQRSNLIHHKRVHTGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGKFFS 

HIASLIQHQIVHTGERPHGCGECGKAFIRSSDLMK 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKIHIRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


ARRSSSSQGSAAPTPCQVVEASRDQLVAGPSGK 

MGNREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFITRENCLILA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDIDGKKDIKAAMLAERKFFLSHPAYRHIADRM 

GTPHLQKVLNQQLTNHIRDTLPNFRNKLQGQLLS 

IEHEVEAYKNFKPEDETRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKNIHGIRTGLFTPDMAFE 

AIVKKQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQLERQVETIRNLVDSYMSIINKCIRDLIPKTI 

MHLMINNVKDFINSELLAQLYSSEDQNTLMEES 

AEQAQRRDEMLRMYQALKEALGIIGDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPTTQRRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPTI 

IRPLESSLLD 


3056 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEILWASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLRNLLSTRPHIDKIMSTHGKQIMQAVTLILEG 

EHNIEVKEQTLCILANIADGTTAKDLIMTKDDrLQ 






KKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 

TAHVTA


3059 


A 


679 


167 


SSWPSLSSQMHFPSFHLHVAAHYGRDSFVRLLLE

FKAEVDPLSDKGTTPLQLAIERERSSCVKILLDHN 

AN1D1QNGFLLRYAVIKSNHSYCRMFLQRGADTN 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 

TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 


3060 


A 






PPLQLDMDPNUYGADGDSCTCAGSCKCKECKCT 

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 
A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKRPKKETKKKR 


3062 


A 


1589 


276 


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

VIDSSMKM^KAFFRWLWAMLRMTEDHVLPELN 

KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 

KPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWNNKTSlsfLHYLLFTILEDSLYKMCILRRHTDIS 

QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAQF 

YDDETVTVVLKDTVGREGRDRLLVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRXVSCVLSSNLRHVR 

VFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFIFGPLMMLLMHPYAQKRSRYIYVVWVLF 

MOGLFSMYFHMTLSFLGQLLDEIAILWLLGSGYS 

IWMPRCYFPSFLGGNRSQFIRLWITTVVSTLLSFL 

RPTVNAYALNSIALHBLYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLISITFPYGMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 


3064 


A 


1 




AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCV1WHSFKPEELMVKTKDGYVEVSGKHEEKQ 

QEGGIVSKNFTKKIQLPAEVDPVTVFASLSPEGLL 

HEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065 


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDBPDLLGGNGCLGSVVFSESFLTS 

QILVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLOSWPEEGNVHFFSSGT T F^HPimn^TTTQvn 

HMNSISFYDGDSTSTVAALLIDFKSSLLPHLPVHF 
HGSSWLMIALFPKSKIYQAFYSEVFSLWKQQDN 
SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 
GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 
RTHLPVLLQQAEINTTHRIESDKVnsiVTGLPGCH 










ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDVVQALQTHPDSNVKASFTIGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNWFT 

SHTTEQRHPLLVQLQSLBRAANPAAAFILAENGIV 

TRNEDffiLILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSIKPSPFSGNIYHILGKXTCFSDSERTMEVCYNT 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

VFIGCSLKEDSDCDWLRQSAKQKPQRKALKTRG 

MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 

GDKTDFHPLMDQFMNDYVEEANREDEKYNQELE 

QQEYHDLFELKP 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGFIWSRYSLVIIPKNWSLFAVNFFVGAA 

GASQLFRIWRYNQELKAKAHK 


3067 


A 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KJEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE 

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG 

EEAAETE 


3068 


A 


3 


1679 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSVVNA 

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGmVKMGKVLGRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKHNEARAVKAFQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAIWFYTNSIFGKAGPPAKIP 

YVTLSTGGIETLAAVFSGLVIEHLGRRPLLIGGFG 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILAIIAS 

FCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFATICITGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEKIDSAV 

TDGKINGRP 


3069 


A 


861 


300 


AAGAVVSAMPKAKGKTRRQKFGYSVNRKRLNR 

NARRKAAPRIECSHIRHAWDHAKSVRQNLAEMG 

LAWPNRAWLRKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAEASLPEKKGNTLSRDLIDYVRYMV 

ENHGEDYKAMARDEK^TTYQDTPKQIRSKINVY 

KRFYPAEWQDFLDSLQKRKMEVE 


3070 


A 


325 


2019 


LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPHAMADSRDPASDQMQHWKEQRAAQKADV 
LTTGAGNPVGDKLNVITVGPRGPLLVQDWFTD 






WO 01/57190 



PCT/US01/04098 













EMAHFDRERIPERWHAKGAGAFGYFEVTHDIT

KYSKAKVFEHIGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGNWDLVGNNTPIFFIRDPE^F 

PSFIHSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAWCKFHYKTDQGDCNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVANYQRDGPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 
NL 


3071 


A 


1 


1187 


SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSVVQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVIEKRJQKYSEGEIRFNLl^IVSDRKMIYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK

NQMLIEEEVQKLKRYKIENIRRKHNYLPFIMELL 

KTLAEHQQLIPLVEKAKEKQNAKKAQETK 


3072 
3073 


A 


103 
57 


2775 

2415


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDKKNA 

SSRPASAISGQ>n^NHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKEERRRAAVEEKRRQRLEED 

KERHEAVVRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPIIMPYKAAH

SRNSMDRPKLFVTPPEGSSRRRHHGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVXVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

EElMKRTRRTEATDKKTSDORNGr>TAK*nAT mn 

TEVSALPCTTNAPGNGKPVGSPHWTSHQSKVT 

VESTPDLEKQPNENGVSVQNENFEEIINLPIGSKP 

SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 
VQTQQTAEVI 

PPRVCRDHVCLICWDPIAG1GGSRSTMPALPLDO 










LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDIvnPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVTFDETLQKCLDSYLRYVPRKFDEGVAS 

APEWDMQKRLHRJSWLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAIKKRRLEDSKLLGDLWQRLSH 

SRKKLMEBFHIILNQICLLPILESSCDNIQGFIEEFL 

QEFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYBLQAVESAWEGVDRRKATDAKDPSV 

IEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSVWEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDKFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 


3074 


A 


3 


251 


GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 
RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN
WEAKKARLEWELKEEEKKKECAARGEDYEKVK 
LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 
RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 
LLHGTHWSTEEIDRMVIDLEKQIEKRDKYSRRR 
PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 
KQNLERGTAV


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RORELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDAI)IDYINERNAKF1S[KKAERFYGKYTAEI 

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIYTGGATGIGKAIVKELLELGSNVVI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVNNLVKSTLDTFGKINFLVKNGGGQFLSPA 

EfflSSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSIVNITVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSWCFLLSPAA 

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSVVKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 









GQSLPPQKKAYLSHLSTGSGHEEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGATPAMYLDCISDLRQKEITDGIHSSSDINILYN

DAVESCIQDPSAEGLSEEVPVVFEELPWFEDVA 

VYFTREEWGMLDKRQKELYRDV3^1RMNYELLAS

LGPAAAKPDLISKLERRAAPWKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRRIKRTYRPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTVILGKYRKRTACTQF1KYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETIVSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 

CDRHniWFKPYQSSNKRLNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEWDTMAWPSGIELASFGNDDILNLARYFECSL 

f 1 lj Y sbbALLEEWLGLKTIAQHLPFSMLCKNALA 

QHCRFPLLSKLMAWVCVPISTSCCERGFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 

RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKD 

CIMEPPERLLYPHTSQEAPGMS 


3079 


A 


343 


1513 


FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVKLLLLGAGESGKSTTVKQMKIIHEDGFSGED 

VKQYKPVVYSNTIQSLAAIVRAMDTLGIEYGDK 

ERKADAKMVCDWSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

aal> i v" 1 iiyDILRTRyXTTGlYETHFTFKNLHFR 

LFDVGGQRSERKKWIHCFEDVTAIIFCVALSGYD 

QVLHEDETTNRMHESLKLFDSICNNKWFTDTSII

LFLNKKDIFEEKIKKSPLTICFPEYTGPSAFTEAVA 

YIQAQYESKNKSAHKEIYSHVTCATDTONIQFVF 

DAVTDVIIAKNLRGCGLY 


3080 
3081 


A 

A


41 
3 


997 


1996 1 ] 


EARTARELTDGVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP

MYSGTFDCFRKTLFREGITGLYRGMAAPUGVTP 

MFAVCFFGFGLGKKLQQKHPEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

LDCAKKLYQEFGIRGIYKGTVLTLMRDVPASGM 

YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 

GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 

RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 

SVAMKFLNWATPNL 

[MADMEDLFGSDADSEAERKDSDSGSDSDSDOE 










NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEI 

KESNART/KWSDGSMSLHLGNEVFDYYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HRKMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MIKKEEERLRASIRRESQQRRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDDDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEETVQSVDIAA 

FNKI 


3083 


A 


3 


921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

KPATPAEDDEDDDEDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCWEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLOTGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKTPS

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 










EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVK31ACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQMIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTOQSNNQQSN 
FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 
LNMGEEETLDDY 


3085 



A 


128 


4050 



KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKOEELLRKOEEEAAKWARFFFPaoppt p 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

rWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

3KPSGTTKSLLEIQQEEARQMQKQQQOOOOHOO 










PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNK1WASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 


3086 


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 

LEAQEPLCANLVPVPITNATLDRITGKWFYIASAF 

RNEEYNKSVQEIQATFFYFTPNKTEDTEFLREYQT 

RQDQCIYNTTYLNVQRENGT1SRYVGGQEHFAH

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRIPKSDVVYTDWKXDKCE 

PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLA3E 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTWH 

NATDGIKGSTESCNTTTEDEDLKVRKQEIIKITEQ 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYIDGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 


3088


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNHTADHVSPLHEACLGGHLSC 

VKILLKHGAQVNGVTADWHTPLFNACVSGSWD 

CVNLLLQHGASVQPESDLASPIHEAARRGHVEC 

VNSLIAYGGNIDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMQLCRLRIRKCFGIQQHHKITKLVLP 

EDLKQFLLHL 


3089 


A 


73 


432 


DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRVVSYQLSSRSTCLKAGVIFTTKK 
G(^FCGDPKQEWQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611 


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKmVGSEVKT^SLDPSTQWSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 










GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPVFCRFFHFRRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTKlsfVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTDSPHRQVAWKRAVKG 

VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WLADLTSGNVNKENKEKQPTMPILKNEIKCLPPL 

PPLSKSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTIKPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK

PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDVPNRLKNEKEPMVLKLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

DAA>TVMVYVGIPKGQCEQEEEVLKTIQDGDSDE

LTIKRREGK£KPGALWHIYAAKDTEKIREFLKK 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 

EYGVQGWAJTVQFLGDVVFIPAGAPHQVHNLYSC

KVAEDFVSPEHVKHCFWLTQEFRYLSQTHTOHE 

DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091


A 


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYVNGHCTKYEPWQLIAWSVVWTLLI 

VWGYEFVFQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPEIVAPQSAHAAFNKAASYFGMKI 

VRWLTKMMEVDVRAMRRAISRNTAMLVCSTP 

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGVTSrSADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI 

IKTARFLKSELENIKGIFVFGNPQLSVIALGSRDFD 

IYRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 

MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSWEETDKGSFVGNIAKDLGLQPQELADGGVRI 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

PCLVSF^VEDKMKLFPVEVEIIDINDNTPQFQL 










EELEFKMNEITTPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLILTASDGGEPVRSGTLRIYIQWDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQEFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVTITSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETraSL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEDLYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD 

TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 

ECISYLEKNNS 


3093 


A 


1 


3868 


PPDNQKLGLLEALLKIGDWQHAQN1MDQMPPYY 

AASHKLIALAICKLIHITIEPLYRSVTSWAVDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKWRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTED 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTILFD 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 

ACILSNCirEALANPEKERMKHDDTTISSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIRNTKKSSQRLKDALLDHDLALPL 

GLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHfflSSKYDELKKSEKGSKQ 

QHKVHKYITSCEMVMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYDLAVPHTSYEREVNKLK 

VQMKAIDDNQEMPPNKXKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLCYDRVFSDHYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTEL 

RATGFDGGNKADQLDYENFRHWHKWHYKLT 

KASVHCLETGEYTHIRMLIVLTKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLEELKESSAKLYINHTPP

PLSKSKEREMDKKDLDKSRERSREREKKDEKDR 

KERJKJUDHSNNDREWPDLTKRRKEENGTMGVSK 

HKSESPCBSPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 

PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 

ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 

SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 

KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 

DKDRSGYTOEHELDALLKDLYEKNKKEMNIQQL 

TNTYRKSVMSLAEAGKLYRKDLEIVLCSEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWTLASSWMGLVGTYSCFWTKYMNHLTVHN 

REVLYELEEKRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKGVLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIBLPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 

EAQPEWLRAEVKRLSHELAETTREKJQAAEYGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKKVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 

LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 

ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 

EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 

EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEDLACKYHVAVAEAGELREQ 

LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 

VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPMNIYNLIAnRDQKHLQAAVDRTTELSRQ 

RMSQELGPAVDKDBCEALMEEDLKLKSLLSTKRE 

QITTLRTVLKANKQTAEVALANLKSKYENEKAM 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYITQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

i ro v on i ^/\^/iojJKAcO 1 uLAJN V<> Vr UatiKrlol I U 

D 


3097 


A 


1 



879 


MVKWPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD
GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TLSALLSWCHLH>^STLINKINNLU)AVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KKJDEAKDERLCENNAEFNKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

ENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 

NE 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRVVRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANnQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LDGVRYLHALGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWVVSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDDEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDWLKKVKHRLV 

ENMSSGTADALGLSRAELCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLNKAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STGNYEYRLILRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPDEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101 


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAMEKCKDAGLAKSIGVS 

NFNRRQLEMILNKPGLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT

TQVSNWFKNRJRQRDRAAEAKERENTE^NNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKXEQAKNKEDSNIRENSSGAGKTBCRAFD 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTmLNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIYTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMBKPEIIDELLNIEKNPQKPQY 

DCKFENVKWIYDQEAQEFNITHLQQLWANHAV 

KTHMLYSMLQGLDTWVPCGIGPKMDGMTEWG 

NVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 

LESRIQHFVRRGRIEHPFBLFHEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEDCSII 


3104 


A 


227 


1519 


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGHIQTALYGKMGRVRSPHPYGH 

RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA 

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVWGFSLGGNIVCKYLGETQANQEKVLCCVS 

VCC^YSALRAQETFMQWDQCRRFYNFLMADN 

MKKIILSHRQALFGDHVKKPQSLEDTDLSRLYTA 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE 

NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 

VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWVTKKDCVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPVYPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSIIVNQ

LKKIIKRKHTLPNYKIRFKPFFPYQTLQGFEEDEE 

HIHIQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 

VPLRQCPG 


3106 


A 


079 




J / A A APAPDT DT1T7AOATTTT1 oT»r»T Tk a t» t~>t i~i .» t-v i ■»-» 

MAAACjAGRLRRVASALLLRSPRLPARELSAPAR 

LYHKKWDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVMKLQIQVDEKGKIVDARFKTFGCGSA 

IASSSLATEWKGKTVEEALTIKNTDIAKELCLPP 

VKLHCSMLAEDADCAALADYKLKOFPKKGFAF

KK 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGIVFLETSERMEPPHLVSCS 
VESAAKIYPEWPVVFFMKGLTDSTPMPSNSTYPA

FSFLSAIDNVFLFPLDMKRLLEDTPLFSWYNQINA 

SAERNWLHISSDASRLAnWKYGGIYMDTDVISIR 

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF 

QEVSDLRCLMSFLHPQRFYPISYREWRRYYEVW 

DTEPSFNVSYALHLWNHMNQEGRAVIRGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQROTQ 

LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE 

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VDKHERRLDTDLARFEADLKEKQIESSDYDSSSS 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 

QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 

DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA 

CVGLTTKPRGKWFCPRCSQERKKK 


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNKLGTSAFSEWTVNTLAFPITTPEP 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL 

YKKTKRA1SSKKYSVAKAEAEAEATTPIELISRGP 

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTEPEENGENASNSTLPLT 

QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 

VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST 

FGLDTTIWYEPQPRPRPSPRQARRAEPSLHQVVLQ 

PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 

ATSPPERALSKL 


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 

AEGPEEENEGPEPAKKAGPLGQGALDAVQSLPL 

KNPFYDSSDNPYTRWLASTEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA 

SLIRLTPTQVKIWFQNHRYKMKRARAEKGMEVT 

PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 


3111 


A 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH 
GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST

RHT^SNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLIPYPLITKEDINAIEMEEDKRDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERHCNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQ 

l^bfcriKPKIGLSLK^ 

DSVFNKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SIVDSILMERRIRPWINKKIIEYIGEEEATLVDLVC 

SKVMAHSPPQSILDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET 
NlLl^TTPNKTPPGADPKQLERTGTVREIGSQAV 
WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELWPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFnGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LIKATVIPNRVKMLPYFGIIRN^^ 

REYYRLLNVEEGCSADEVRESFHBCLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKPKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

V i V JsJVilKQbKQQKITQAIERL VEDLIQESMAKGDF 

DNLSGKGK^LKKFSDCSYIDPMTHNLNRILIDNG 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRINDFNLIVPI 

LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSmr^ASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDOKKFIDOVVEKIEDLLOSFFNKTsrr DT 

EPCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE 

RYWISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSBCLIEPFFNKLFLMRVMDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAPGNIQIS 

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSTVGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTOSCAEPLSEGRKKA 

KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQL1JVVNHPGYMAFLTYDEVQERLQACRDKPG 

SYffRPSCTRLGQWAIGYVSSDGSILQTEPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKICAESNKDV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 

QSSIILCFSPVGRTLRVRARKFPAIVNCTAIDWFH 

ANWQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNYTTPKSFLEQISLF 

KNLLKKKQNEVSEKKERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITK1GLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 

NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

IEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THCERWPLVIDPQQQGIKWIKNKYGMDLKVTHL 

GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 

LGRNTIKKGKYIRIGDKECEFNKNFRLILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKINEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFE.SPGVDALKDLEILGKJRLGFnDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EffllPQGLLENSIKITNEPPTGMLANLHAALYNFD

Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 


3119 


A 



1254 


4133 


PLATLTMEEQGHSEMEEDPSESHPfflQLLKSNREL 
LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT 
QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY 
VDLRPWLLEIGFSPSLLTQSKVVVNTDPVSRYTQ 
QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 
LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 
GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH 
FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 
VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 
SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 
GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 
ALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQH 
FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM
QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR 
GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 
ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 
TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 
SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL
RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 
VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 
AARGICANYLKLTYCNACSADCSALSFVLHHFP 
KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 
VNQITDGGVKVLSEELTKYKTVTYLGLYNNQITD 
VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 
LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 
LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 
EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 
IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 
EEAKVYEDEKRHCF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPK 

KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 

LGRLYNKDAVffiFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FICPWGLEMNGRHRFCFLRCCGCVFSERALKEI 

KAEVCHTCGAAFQEDDVTVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYEDETVNSNIPTNLR

VLRSILENLRSKTOKT F^nv^AOlVTFVrRTPrTVQ 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRXQCSK 
EDGGGWWYNRCHAAWNGRYYWGGQYTWDM 
AKHGTDDGWWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVPLVK

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGWWMNWKGSWYSMKXMSMKIRP 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 


3125 


A 


3 


571 


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVTENGEIRFNGKGKKIRKJPR 

TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTQTQ\HOWFQNKRSKFKKLLKQGSNPHESDPL 

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 


3126 


A 


43 


5377 


LSVFFPEPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CILRGNFAEAHQ\^FITOLKSSPSSGELMFMERY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPEPM 

LQEDFW1STALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI

DHVLLNADGIRGFPWLQQISKSLNYLLMSASQT 

KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA 

SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE 

QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS 

RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK 

VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL 

AQENLSLSVPQVTVSCCCEPLALCSSRQSQQTSSL 

LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL 

ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA 

CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV 

ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS 

LAVNLCGWASLSTVLLGLHSPIALDVLSEAFEES 

LVARDWSRALQLTEVYGRDVDDLSSIKDAVLSC 

AVACDKEGWQYLFPVKDASLRSRLALQFVDRW 

PLESCLEDLAYCISDTAVQEGLKCELQRKLAELQ 

WQKD1/GLQSPPVWCDWQTLRSCCVEDPSTVMN 

MILEAQEYELCEEWGCLYPIPREHL1SLHQKHLL 

HLLERRDHDKALQLLRRIPDPTMCLEVTEQSLDQ 

HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY 

VGSKJLLTLPEQHRASYSHLSSNPLFMLEQLLMN 

MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL 

LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA 

DPETLPRSPSAEFSPAAPPG1SSIHSPSLRERSFPPT 

QPSQEFVPPATPPARHQWVPDETESICMVCCREH 

FTMFMIRHHCRRCGRLVCSSCSTKKMVVEGCRE 

NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE 

SPPYSFVVRVPKADEVEWILDLKEEENELVRSEF 

YYEQAPSASLCIAILNLHRDSIACGHQLIEHCCRL 

SKGLTNPEVDAGLLTDIMKQLLFSAKMMFVKAG 

QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ 

ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT 

TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF 

DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD 

YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ 

ECLFYLHNYSTNlAIISFYVRHSCLRrMLLHLLNK 

ESPPEVFffiGIFQPSYKSGKLHTLENLLESIDPTLES 

WGKYLIAACQHLQKKNYYHILYELQQFMKDQV 

RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH 

LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM 

NTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG 

NNHMKMDVACKVMLGGKNVEDGFGIAFRVLQ 

DFQLDAAMTYCRAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTILLNCLEAFKRJPPQCCFCSA 

QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV 

KQEHSRATALVQQVQQAAKSSGDAVVQD1CAQ 

WLLTSHPRGAHGPGSRK 


3127 


A 


467 


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 

PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

KTQLATLTSSLATVTQEKSRMEASYIADKKKMK 
ODLEDASNKAEEERART FGFT KC±l nvm ATtTv a 

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ 
RQDLELRLEETREALAGRAYAAEQMEGFELQTK 
QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA 
ARLKSHFQAQLQQEMRKVIIHISFKHQPLT 


3128 


A 


1854 


798


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTWSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLITIWLLGIA 

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 


3129 


A 


2340 


1192 


ELARRPKQQSSEKSRN1V1IRNWLTIFILFPLKLVEK 

CESSVSLTVPPWKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEVVVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAIS1I 

NQVIGWIYFVAWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYIKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLUIVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDWFFIQHFCLYRKRPGYD 

QLN 


3130 


A 


3] 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

WGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRVFQFCLRYTKEEEVKRIVSGIIHHTQAP 

KLLKRLFLFSYATAAQNNTVTDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFWPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLKMSALPKEQDDG1LQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 

EFWDTDIKWFSLLESSSWLDIIRRCLKKAIEITEC 

MEAQNMNVLLLEENASDLCCLISSLVQLMMDPH 

CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKRISKLINSSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLE3MMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965 


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSffAITRYWFAATVAVPLVGKLGLJSPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRLETGAFDGRPADYLFMLLFNWI 

CrvrTGLAMDMQLLMIPLlMSVLYVWAQLNRDM 

IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 

mVGHLYFFLMFRYPMDLGGRNFLSTPQFLYRW 

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD


3133


3134 


A 


1 


2921 


M1CFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFORH 

AHEQDTKMHEIYKGNITPQLNKNTLKTSAATDV 

WAVYFSQFWDDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTOGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAELALLLHPVDQANTLKSPVSESVSPWPDYLP 

TENGDFLSSKRKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSVWFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLPvHYLCNRPVGSDO 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL 

WERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSOSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHKKMTVE 


3135 


A 


9 


1579 


EEEGLSGGGi'K.VPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEGCKVGRGTYGHVYKARRKDG 

KDEKEYALKQffiGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKJCPMQLPRSMVKSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLKYMEKHKVKPDSKVFLL

LQKLLTMDPTKR1TSEQALQDPYFQEDPLPTLDV 

FAGCQIPYPKREFLNEDDPEEKGDKNQQOOONO 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSOS 

3STLGYSSSSOQSSQYHPSHOAHRY 






3 


1111


iRKMAEPPSPVHCVAAAAPlATVSEKEPFGKLO 

-SSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

jNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

5SRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

.HAQPHHLLLPAAAAAASANAKSRRPKEKREKP 

IRRHGLGGAREAGGASREENGEVKPLPRDKIKD 
OKERDKEKEREKKKHKVMNE1KKENGEVKJLL 
:SGKEKPKTOIEDLQIKKVKKKKKKKHKENEKR 
IRPKMYSKSIQTTCSGLLTDVEDQAAKGILNDNI 
J3YVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ

NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVRMKRA 

YKNKWGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERHTQLEQMFRDIATIVADKCVNPETXRPYTVI 

LIERAMKDIHYSVKTNKSTKQQALEVIKQLKEK 

MKIERAHMRLRFILPVNEGKKLKEKLBCPLIKVIES 

EDYGQQLEIVCLEDPGCFREIDELIKKETKGKGSL 

EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLIKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWVNGVKPGWQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVEPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGWRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRIGFPSTSPAKA 

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HffiQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ 

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTHMIESNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSIMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 
3139


3140


110


2499


4939


PCT/US01/04098



WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITTVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQKCELRVLCCFAFSLSQDWELPAKREAO 

QPLKEGVRDMLVKHHLFSWDVDG 

QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIBEDLLPASYFSTTLLGVQTDORV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDKLLLRIWDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDEPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTOLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQDCCELRVLCCFAFSLSQDWELPAKREAO 
QPLKEGVRDMLVKHHLFSWDVDG



S AALGASLAu* RPGLPG VHGRGPG'ILSGRAMEG 
AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 
WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 
AGDEIVGlNDIGLSGFRQEAICLVKGSHKTLKLV 
VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 
SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 
HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 
RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 
LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 
GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 
KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 
EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 
PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 
LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 
LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWO 
GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 
PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 
AEGCSAGAQEPPRASRAEKASQRLAASITWADG 
ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 
PFDAHVGKPTRRSDRFATTLRNEIQMHRAKLQK 
SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 










AGTYKDHLKEAQAJR.VLRATSFKRRDLDPOTGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHIJILQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRIERVMDNNTWBGVIVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATOSTYYSTSAPKAELLIKMKDLQEQQErlEEDS 

GSDLDHDLSVKKQELBESISRKLQVLREARESLLE 

DVQANTVLGAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELBCENLDRRERIVF 

DILANYLSEESLADYEPIFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRVVLAACSPYFHAMFTGEMSESR 

AKRVRIKJEVDGWTLRMLmYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVK^SSACKNYLffiAMKYHLLPTEQRILMK 

SWTRLRTPMNLPKLMVVVGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTYDSYDPVKDQWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGWGGLLYAVGGYD 

GASRQYLSTVECYNATTNEWTYIAEMSTRRSGA 

GVG\HLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 

DGSCKLASVEYYNPTTDKWTVVSSCMSTGRSYA 

GVTVIDKPL 


3142 


A 


1211 


1311 


FSNLTTEKVAHAKEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYJQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKWTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 






WO 01/57190 





3144 


A 








^14T~ 




/o 


604 


^VSGIVLDLLl'yLHFLSNMNLDGSAQDPEKREYS 
WCVGl^DDKKSERMTAVVHDREWI^GE 
YHAMDIRCYHSGGPLHLGDIEDFDGRPCrVCPW 
HKYKITLATGEGLYQSINPKDPSAKPKWCSKGK 


3146 


A 


"T 


~~333~ 




3147 


A 


3 


1151 


I^^;^ GTRSTLLliCLDSGFRp GASRGLVGSW 
AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 
LGDPKILFLFYFPAAYYA^^VGIA^MSLITEW 

lnlifkwflfgdpj>fwwhes^sqSwo 

FTSSCETGPGSPSGHCMTGAALWM^SSW 
ATTlARSRWVRV]Vn>SlAYCTFLLAVGLS^iAH 

fphqvlaglitgavlgwlmipr^Sy^ 

LTALALMLGTSLIYWTLFTLGLDLSWSISLAFKW 
I P^EWIHVDSRPFASLSP^SGAALGLGIALHSPC 

SqS™ LKYTLWCLV ^ wwa ^ 


3148 


A 


1437 


594 


K5>i- SLSFSLLbl'&iliMMALGAAGATRVFV AMVAA 
AmGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 
GHPGSAVSAAPGILYPGG^YQTroNYOPYPCAE 
DEECGTOEYCASPTRGGDAGVQICLACRKP^ 


3149


A




1 


1562 



^t™ iKA ^ QLL ^' J, ' ASSDS NKALEQRRTLH 

TPKLEHLDRVLYEWFLQKRSEGVPVSGPMLffiK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGKK 

lda^ekqsadhqaaeqfcaffr£™gS 

n^^ ETG ™ CLP ^ E GGAVPGPKQGK 

drltvlmcanatgshrlkplaigkcsgpp^Skgi 

QHLPVAYKAQGNAWVDKEIFSDWFHmF^SVR 

MGLPEDSKAVLIXDSSRAI^Sss^ 

^LPASVASLVQPMEQGIRRDF^INPPW 

^QGPHARYNMNDAIFS VACA WNA VPSHVFRR A 

^WPSVAFAEGSSSEEELEAECFPVKPHNKSF 

pnLELVKEGSSCPGQLRQRQAASWGVAGREAE 

grppaatspaevvwssektpkadqdgrgSSe 

JEEVAWEQAAVAFDAVLPJ?AERQPCFSAQEVG 




1 


32 4 

i 


125 \ 
P 
G 

IV 
A 

1* 


A VMISTAPL YSGVHN WT<5<: np TPAyrrvrvn-^T^ , - 

lsdeesttgdcoSgSSS^ 

SGSNARGADPDGSATEKLGHKSEDKPDDPOPK 
IDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 
SNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 
QGVGEKNTTILATLGTGVPVEGnj'LVrn^ 










LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYBPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPWASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPAR1AJPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDWFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQEPPE 

ARRLIVNKNAGETLLQRAARLGYKDWLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDELNILLE 

HGA 


3150 


A 


3 


2795 


SLRMHNLSIL\01QIKFYYQEI1X3QLIMMSLPNVLI 

IGKOTFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 

IERIQGLDFDTKAAVAAHIQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLIDERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSAKN1YR 

DELDALREKAVRVDKLESEVSRYKERLHDEEFY 

KARVEELKEDNQVLLETKTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

IETLRENSERQIKILEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKILHESIKETSSKLSKIEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEKIEALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 
3152


2515


2645



^SLEQETSQLEKDKKQLEKENKKLRQQAE^ 
mEENNVKIGNLEKEMCTLSKEIGIYlS 

nS^ L ^ TroK ^ V ^ DL VSE^ 
QQM^LEKL^ 



G^WLHLTLLGASLPAALGWMDPUTSRGPDVGV ' 
GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 

srgsrcvlsrktgepecqcleacrpsyvpvSd 

?^^^^ CLLGKRITVIHSKDCF LKGD 




^ ^^ PESQAQEPGVAASLRCH AEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 

r^ YTCIAKNEVGVDEDISSLF IEDSARKTLAm 

lwreeglsvgnmfyvfsddgiivihpvdceiorh 

Ri^YIYVAQPALSRVLVVDIQA^VLQSIGVDPL 
PAKLSYDKSHDQVWVLSWGDVHKSRPSLOVITE 

astgqsqhlirtpfagvddffppSsS 

F^SCTAVHKVDLETMMPLKTIGLHHHGCWQA 

MAHimGGYFFIQCRQDSPASAARQLLVDSVTO 

SVLGPNGDVTGTPHTSPDGRFIVSAAADSPm^ 

TOTYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 

GP^QPWGGTHRIMRDSGLFGQYLLTPARESLFLI 
NGRQNTLRCEVSGIKOfirrWw^, LtU 



UAOWQVSLTGRWSPGREAGAGEVRODPGSTAA" 
SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

aarggpgrrdgrgggprstaggvaTaWsl 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWFF 
^GLSFGRQmQDGALRLTTEFVK^GGQSaGD 
^W^PQDSGTSALPLVSLFFYVVTOGKEV 
^ E I G ^ G Q™ GH ^ELGDFRFT11J.PKPG 
DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 

fqj^pgasperylglpgslkwedSwg^ 

QFLr W VTLKIPISffiFWESGSAQAGGNQA£pS 

gslltqaleshaegfrerfektfqi^kgISct 
qvlgqaalsgllggigyfygqglvlpdigvegsp 

qlwqrwdpsltrealghwlgllnadgwigre 

QILGDEARARVPPEFLVQRAVHANPP1LLLPVAH 










LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNY1ALGALHHYGHLEGPHQARAAKLHGE 
LRANWGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 


4312 


MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVLLADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPEDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN 

PSEETKMQMFrWNNIFFSLGFDVRDHYKDFGGD 

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 

TVVSHPRYLELLERTSRPLKELRHQVLNDRDEEV 

ELCSSVECKGHGNDGRHYILDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FXHEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAETIAADDGTDPRSREVIRNACKAYGSISSTAF 

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 

AFLLSCQIPGLVKDCMEHAVLPVDGATLAEVMR 

QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG 

ELITRSAKHIFKTYLQGVELSGLSAAISHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 

TAWAVMTPQELWKNICQEABCNYFDFDLECETV 

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPAFTEEDVLNIFPWKrrWPKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACLRLLARLHYIMGDYAEALSNQQKAVL 

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 

HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYTIYKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 

NGSSA1OTPLKFTAPSMASVLEQLNVINGILF1PLS 

QKDLENLKAEVARRHQLQEASRNRDRAEEPMA 

TEPAPAGAPGDLGSQPPAAKDPSPSVQG 


3154 


A 


416 


4082 


KFKLDCIMLLTLIILLPVVSKFSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLWDAGVSVIMDFHYNEKRJYWVDLERQ 

LLQRVFLNGSRQERVCNIEKNVSGMAINWINEEV 

IWSNQQEGnTVTDMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YlX5GSVmSKHPTQHNLFAMSLFGDRIFYSTWK 
212


1585


601



MKTI W1ANKHTGKDMV R1MLHSSFVPLGELK W 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENKJYFAHTALKWIERANMDGSQRERLffiEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKnTffiNISQPRGIAVHPMAKRLFWTDTGINPREE 

SSSLQGLGRLVTASSDLIWPSGITIDFLTDKLYWC 

DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLOG 

SMLKPSSLWVHPLAKPGADPCLYQNGGCEmC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHOL 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 
QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 
NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 
CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 
PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 
CMYffiALDKYACNCVVGYIGERCQYRDLKWWE 
LRHAGHGQQQKV1WAVCVWLVMLLLLSLWG 
AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 
rn^o PQPWFVVIKEH Q DLKNG GQPVAGED 
GQAADGSMQPTSWRQEPQLCGMGTEOGCWIPV 
SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 
SLLSANPLWQORALDPPHQMELTO 



uisiiWY WERLAERRGRLWSREEAMATMENKVI 
^™ SMLALGTLAEA( 5' reTCTV APRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 



PR V KAAD V AAU AQA WS AGMAK SN GENGPRAP 
AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 
NGFAERRIDKFGFIVGSQGAEGALEEVPLEVLRO 
RESKWLDMLNNWDKWMAKKHCKIRLRCOKGI 
PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 
GDPKWLDVIERDLHRQFPFHEMFVSRGGHGOOD 
LFRVLKAYTLYRPEEGYCQAQAP1AAVXLMRMP 
AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 
FSRTLPWSSVLRVWDMFFCEGVKDFRVGLVLLK 
HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 
LVQEVVELPVTERQIEREHLLQLRRWQETRGELO 
CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 
^ PGS . KAKPKPPKQA Q KE Q RK Q M KGRGQLEKP 
PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 
QDLAPQVSAHHRSQESLTSQESEDTYL "~ * 



SSAMGSRSSHAAVIPDGDSIRRETGFSOASLLRLH 
HRFRALDRNKKGYLSRMDLQQIGALAVNPLGDR 
IIESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 
QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 
HEMLQVLRLMVGVQVTEEQLENIADRTVOEAD 
EDGDGAVSFVEFTKSLE KMnVBfflf mctp rty 


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFITYM
DNWRQNTTAEQEALQAKVDAENFYYVILYLMV 
MIGMFSFIWAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 


416 


PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 


3160 


A 


179 


409 


KPKTKILKMVYYPELFVWVSQEPFPNKDMEGRL
PKGRLPWKEVNRKXNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161 


A 


683 


1186 


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

roGRRKIAFAITAIKGVGRRYAHVVLRKADIDLT 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKJDGKYSQVLANGLDNKLREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 


3162 


A 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPGYPKPITVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRNILRD 

WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 

GSWAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 


3163 


A 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL 

RTNVFVRFQPET1ACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLKAKHTRDDLKSSNRHGHKRKKSRS 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3 


3274 


DCRLQAAMPTNFIVVPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGWEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWrVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGVVPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP 








GAAJFQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALVVFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSEPTGTILAIVTTSFIYLS 

CIVLFGACIEGWLRDKFGEALQGNLVIGMLAW 

PSPWVTVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGELI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASAVKQED 

OTFSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHmVWWTVHIXjGMLMLLPFLLR 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRISAEVEVVEMVENDISAFTYERTLMMEQRS 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVtXNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREVITIYS 


3165 


A 


3 


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVEWPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IEKLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KXLIRAGIPHErlRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

Jti i bur 1 abtjlC^KLKN VLLAF SWRNPDIGYCQGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFLVVFVDSVVSDILFKJWDSFLYEGP 

KVTFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 

RTDLDARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 











TRAFADLLVERQTGQQDSDPYSPVTOQIIJEMVN 

GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH 

YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS 

VREALGVESHYSRACASSETESEAGDIMDQQFEE 

MNNKLNSVTDPTGFLRMVRRKhH^FNR 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 

IAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 

ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 

RKNPPTYSSSTSMAAVIGOTKLPSAVMDRWLWG 

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATfflSWKLSALVLTPSMDGNPASS 

KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 

VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 

AFYSSLLNGLKASAALGEAMKVVQSSKAFSHPS 

NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 

ARDALRVLLHLVEKSLQRIQNGQRNAMYTSQQS 

VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 

LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 

EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 

VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF 

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENRNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 

SPTTSEMSIKDSPSQHSGRPSPGCDSQTSQLDQPL 

FKLKYPSSPYSAfflSKSPRNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV

QAVHNLKMFWQSTPQHSTGPMKIFRGAPGTMTS 

KRDVLSLLNLSPRPNKKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167 


A 


1 


762 


AARRRQKGK£ENMMMDLFETG S YFF YLDGEN V 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEELRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEWEK 


3168 


A 


701 


246 


TSRRVTMKFNPFVTSDRSKNRKRHFNAPSHVRR 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKWQVYRKKYVIYIERVQREKANGT 

WHVGIHPSKWITRLKLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMQE 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVWFGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 










AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSWNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLEEI 

LSEKAGIIQDTWHKATQKGDPVAILKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQEITAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRXALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLEEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYABCEVAGLRQLLLESQSQL 

JJAAiv^JbA^Jvv^oJL/r/L/ALr V K^^J^oJDJVUvorl V -bLHjLJi 

AGAPASSPEAPPAEQDPVQLKTQLEWTEAILEDE 
QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 
ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE
LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 
EGTSV 


3170 


A 


6730 



4027 


THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEBRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKRILGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQWISNAG 

HLl^TYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 

KHNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEJLRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVIILKH^ 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPLH1LRAJYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMHLPCLSRPARCDQATAESNPVT 

r\T7"T TCOTTTCThT r\ACVA VT^DOCFC A ATT T-TVTJT XT/^TZ 

SKRAVRDYLFRVNEATAVLYARHVTASLLAEWP

SHWVSEDILELSGPAHMTYILDMFMQLEEKHE 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFQLNQKALSPSSQFPSAEELRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN 










QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHS1SDRLPEEKMQTQGSSNQ 


3172 


A 


2 


496 


FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDWMPKNYLWLTIVSCFC 

PAYPIMVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVAIASIIIGLLIIGISCAVHFTRNA 


3173 


A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPENAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPWE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APBEQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAIL1ENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAAN1LGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQIFCSELTTICCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETOKVNE 

LMDNIIKEDVNSMQIFTKLSETIVPPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVBCEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTWKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCWLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLKFSP 

EKKXKRCKYKIEKTETIKPEEPLHPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 


3174 


A 


485 


4668 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE
RMGADGETWLKNMLIGVNLILLGSMIKPSECQL 
EVTTERVQRQSVEEEGGIANYNTSSKEQPWFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL 













DYBPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG

ISDRSBELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYIVNVVALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWTPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTh^SDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKWYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLIWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YIISVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSWPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGETILVIXjVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEffiNYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

WADYRVGFGNVEDEFWLGLDNIHRITSQGRYEL 

RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 

SLQF 




3175 


A 


2 


623 


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

AATAEGTMASGVTVNDEVIKVFNDMKVRKSST 

QEEIKKRKKAVLFCLSDDKRQIIVEEAKQILVGDI 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 

SKKEDLVFIFWAPESAPLKSKMIYASSKDAIKKK 

FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 
LEGKPL 




3176 


A 


99 


1567 


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 
ADEGSIFYTLGECGT T^F^DVTFT ttvt QTT»npxrPT3 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRim)l^TTGNTLKSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGIV1LLAYSGVQSKKLTAMQRQLKKHFBCEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 






WO 01/57190 PCT/US01/04098













LDKVTMQQVARTVAKVELSDHVCDWFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 


182 


648 


LGVVGSGAAVGGRQAARGAALGRRPMAAVLG 

ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 

RNQADLDLKAAGWSELSERDAIYKEFSFHNFNQA 

FGFMSRVALQAEKMNHHPEWrTSTVYNKVQITLTS 

HDCGELTKKDVKLAKFIEKAAASV 


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPTFRIRLGAHRGGSGELLENTM 

EAMENSMAQRSDLLELDCQLTRDRVVVVSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIRELAGLVRRYDRNEITIWASEKSSVMKKCK 


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHERNKILTGADGKN 
LTKNDLYPNPKPEVLHMIYMRALQIWGIRLEOT 
YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 
CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 
RETYMEFLWQYKSSADKMQQLNAAHQEALMK 
LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 
QKTIVLQEGNSQKKSNISEKTKRLNELKLSWSL 
KEIQESLKTKIVDSPEKLKhTYKEKMKDTVQKLK 
NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 
IQDLSDNREKLASILKESLNLEDQIESDESELKKL 
KTEENSFKRLMIVKKEKLATAQFKINKKHEDVK 
QYKRTVIEDCNKVQEKRGAVYERVTTINHEIQKI 
RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 
GIEKAAEDSYAKIDEKTAELKRKMFKMST


3180 


A 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVTILSYPPYEQHECHFPNKAMPSAG 

TLPWVQGnCNANNPCFRYPTPGEAPGVVGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVELHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDELKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYCNDLMKNLESSPL 

SREIWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAVFfTOLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDW 

EQAIIRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMRIMGLDNSILWFSWFISSLIPLLVSAGLLWI 

LKLGNLLPYSDPSVVFVFLSVFAVVTILQCFLIST 

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKVY 










RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT 

TMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 

VCPQHNVLFDMLTVEEHIWFYARLKGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 

VALAFVGGSKWILDEPTAGVDPYSRRGIWELLL 

KYRQGRTIILSTHHMDEADVLGDRIAnSHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTDD 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA 

KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS 

YQX^GWKLTQQQFVALLWKRLLIARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGG 

LPPPQRKQNTADILQDLTGRMSDYLVKTYVQIIA 

KSLK^IKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFNNKGWHAISSFLNVIN^ 

NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

mFICFQQKSYVSSTNLPVLALLLLLYGWSITPLM 

YPASFVFKIPSTAYWLTSVNLFIGINGSVATFVL 

ELFTDNKLhWINDILKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGRNLFA 

MAVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQR1LDGGGQNDILEIKELTKJYRRK 

RKPAVDRJCVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNRNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAmKLGLVXYGEKYAGNYSGGNKRIG.STAMA 

LIGGPPVWLDEPTTGMDPKARRFLWNCALSVV 

KEGRSWLTSHSMEECEALCTRMAIMVNGRFRC 

LGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDF 

FGLAFPGSWKEKHRNMLQYQLPSSLSSLARIFSI 

LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V 


3181 


A 


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR

HWLFTEFPEIAPSQNQNHLKDWFLENKSEVPEC

RIWEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPILQTNKDPGLFVYCCDFSSTAIELVQTNSEYDP 

SRCFAFVHDLCDEEKSYPVPKGSLDmLIFVLSAI 

AQLRPKKGQCLSGNFYVRGDGTRVYFFTQEELD 
TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 
WIQCKYCKPLLSSTS 


3182 


A 


3 


1289 


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ 










AEEENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEVVMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMIFMGYQNVEDEAETKKVLGLQDTITAEL 

WIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKXHRCKCCSIM 


3183 


A 


333 


1931 


IAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DVWPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184 


A 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEEAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

RKRLESQREAEWKKEEERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFVVEEEGVGETMTEE 

QSQSFLTEFINYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDLLAEGTITGVIDDRGKFIYITPEELAAVA 

NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA 


3185 


A 


2981 


7173 


CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLLTID 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RFWILPQLWIGINFDRLTLLALFDRNREILENVLA

VILAILVAFLGSILLIQGFFRDIWVFQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 

LIWLLDYGSRNLTATKFKLYGITFTNPLVFISARD 

LVIVFTLCFPIVFFIGLLPQVNTFVMYLCEQLDIHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 

KDSWDGQHIPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLWCIVIGVLWAIHVSTVFTVLQ 

PALKYVLYTLVGFVGFVTHYVLPQVRKQLPWH 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLFVEKNHYPLIVLNELSSSAETIASPKKLNTELG 

ALMTVAGLKLLRSSFSSPTYQYVTVIFTVLFFKF 

DYEAFSETMLLDLFFMSILFNKLWELLYKLQFVY










TYIAPWQriWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVTKYILEGYSITDNSAASMLQWDLRKVLTTY 

YWGIIYYVTTSSKLEEWLANETMQEGLjRJLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLRNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHIDKAVLLVQIDDKYVTVIETGVLELGAEV 


3186 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3187 


A 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 
DGVKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 
IQSKSSKA WHGILMGVPVPFPIPEPDGCKSGINC 
PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 
KNQSLFCWEIPVQIVSHL 


3188 


A 


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQIAQLETALKSDLTDKTEELDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKINKDYQMEVEAVTRKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDIAYGTK 

v«; I JSJr jSJrn, uvLrUUo V Ucr uc, l li±LbKubNLJr dIHIN 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP 

WRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 

EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 

LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 

AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTD 










STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAIIPSSNDPQFDDHMYFPWMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGIFELTDHQKHPAGTIHVILKWKPA 

YLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKNIKQPSEBORmnALSLNDSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAKRDDLKAILQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 


3189 


A 


476 


1175 


MKGSGWHLRSGMVGTLITTILPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDVVQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

GSFLKELEKSKFLPSISTKENTLSKSLEEKLRGLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKWTRSQEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSALRKNGFVVLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTHNMDVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 


3192 


A 



105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 

WTGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 

GRRIPKDWEEFSDLYNEVYNLTQEFFRHDKPVN 

AESQNSVGWTREEVRNRIRNDPDDPEATKRLKL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP 

GAHHHPSGFMRVVELLAEGIPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEQWSVWECEDCELIPADHVIV 

TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 

DKIFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 


3193 


A 


1 


1928 


QLGTRRCLRGDKVTOAMQDFLVT'NLEPRFIEPQT 
ANLSVWKDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WWFQNCHLAPSWMPALERLIEHINPDKVHRDF 










RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGHIQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWMAKYPVLYEESMNTVLVQEVIRYNR 

LLQVITQTLQDLLKALKGLWMSSQLELMAASL 

YNN rvPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG 

LFLEGARWDPEAFQLAESQPKELYTEMAVIWLL 

PTPN1UCAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1 


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPVVPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLEMLNALKIPLRIS 

VGEffiPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQAISSDGWWSLEDVTGSJNTTDS 

NIGLSARSIiRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMML 

QNYLRLVEQYHNVIFHGPEGSLQDYTVHQLALCL 

KHRQMGWQDSPVEIYEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEBCNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRfflSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADLIQHYIfflTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKJVFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLlRrlSIIHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRNPTIVTDVGRP 

FMTAQTSVMQELLLGKEFLNITTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG

ALLLRGSLLASGRAPRRASSGLPRNTWLFVPQQ 

EAWVVERMGRFHRILEPGLNILIPVLDRJRYVQSL 

KEIVINVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAWQLAQTTMRSELGKLSLDKV 

FRERESLNASIVDAINOAADCWGrRrT R YFTKnrp 

WPRVKESMQMQVEAERRKJRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTTLLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 










ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNF>JKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQffiSVKFLLSRGANPNL 

RIvTFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV 

NLEGENGNTAVIIACTTNNSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL 

DNGAQIDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIWTTDGCHETMLHRASLFDHHELAD 

YLISVGADINKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEVVLTIIRSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 

YCLGLIPMTILVVNIKPGMAFNSTGIINETSDHSEI

LDTTNSYL3KTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWnYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE 

VELKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSnQTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEVQKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLIIQKMEII 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198 


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQEPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

WATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKfflTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCNWCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRIHERIHCTVIIPFKCNYCS 

FDSKQPSNLSKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKUVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD










GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTVVSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSDQGAAHPALLCPADSffD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE

VRRLLRRLVGALVAEAGFCYVQVAEGQRVVGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AWLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGVVGGEMTTLVLDNGAYNAKIGY 

SHENVSV3PNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNIHTEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCIIVDSGYSF 

THIWYCRSKKKKEAIIRINVGGKLLTNHLKEIISY 

RQLHVMDETHVINQVKEDVCYVSQDFYRDMDI 

AKLKGEENTVMIDYVLPDFSTO 

LSGKYKSGEQDLRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNrVLTGGNSLF 

rur KDKV i bfc VKCL 1 r 1 Dxu Vb V VLPENPITYAW 

EGGKLISENDDFEDMWTREDYEENGHSVCEEK 

FDI 


3200 


A 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 
SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 
IvUv V w i AiriKJN V r riJLl^L, 1 OL W ILINLUyjLClrN 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 
YLRVFFRThnDAKVGTLVGEDKYGNKTYYEDNKQ 
FFGRHRWVVYTTEMNGKNTFWDVDGSMVPPE 
WHRWLHSMTDDPPTTKPLTARXFIWTNHKFNVT
GTPEQYWYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 
ry WKVbArlilJNJNIV VrJblNr WbvxL WMNCVRQANI 
RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 
AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFII 
TGMVVLIPVSWVANAIIRDFYNSIVNVAQKRELG 
EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR

VQTPQTTRT^O'K'QVTITrili r l^QT>Q'V/'VCT? CAVA/ 

I ouroruv.1 1 l^Jvo I ri 1 UrjsJxoJro V i oKo^ Y V 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204


A


1808 


668 


PESAPLPAFISSRILPAAWRNWCSYWTRTISCHV 
QNGTYLQRVLQNCPWPMSCPGSSYRTWRPTYK 
VMYKIVTAREWRCCPGHSRVSCEEVAGSSASLE 
PMWSGSTMRRMALRPTAFSGCLNr^lCV^PT TFT? 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 










DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRMRM 

KYGGQEFWADLNAMKVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVIRLIEE 

ANSRGLKEVRFMMWhn^HYILHNSFFRREIKJRRP j 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YREYNLFHKTVPEFKYRILQILRVQNQFLWEKY

KRKKEYMNRKMFGRDRIINERHLFHGTSQDVVD 

GICKHOTDPRVCGKHATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFIAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEEmMSmKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHIEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLIEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSEITHKDWQQNNTLIEEMLYL 

IIMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVIEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHW TFIYVQK1SKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFroEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFNNRLNFSDQPNLTQWIRTISQQIKALQFLRKE

ESTPNNASTKNSENVDELQLPEGFRPDFRPKBPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR

SLTRFAAAHWTVASVSVVQGHFCKPFASLVPND 

SHEELPCILDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHIFHLVTMAHIIQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY

SSLENQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 










RRGNPLHLCKERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVGIDWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFLRIIPTQQQRQVQLD

AQAAQQLQYGGAVGTVGRLNITWQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

NKVIHCTVPPGVDSFYLEIFDERAFSMDDRIAWT 

HITIPESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPVVLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DLKAIQDMFPNMDQEVmSVLEAQRGNKDAAIN 

SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVVV 

GIJVDLLSKHDSQHKLSEVITGDLLIIMAQIIVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVELSLLL 

VPMYYIPAGSFSGNPRGTLEDALDAFCQVGQQP 

L]AVALLGMSSL\FFNFAGISVTKELSATTRMVL 

DSLRTWIWALSLALGWEAFHALQELGFLELLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 


3209 


A 


104 


1999 


AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNPEDAQGVEEREALARMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVKRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLFPVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLREG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

LTAALAKADRSHKNPENRKSWAS 


3210 


A 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALWS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRWWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD


3211 


A 


1078 


594 


VGMELPAVNLKVILLGHWLLTTWGCIVFSGSYA 
WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVfflSIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 













AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

\TTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKHQDIETIES 

NWRCGRHSLQRJHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVnTGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTDLIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFV^HLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKHQDffiTIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKI WDKNTLECKRILTGHTG S VLCLQ Y 

DERVnTGSSDSTVRVWD.VNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKHQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKI\\T)KNTLECKRILTGHTGSVLCLQY 

DERVHTGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 










LVEHSGRVFRLQFDEFQIVSSSHDDHLIWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTINARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELES 




A 




[204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVRWDNSALGNSPYHRAPRCIHVYKKN 

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNWLIEDNGNPVGTRIKTPPTSLRKREG 
EYSKVLAIAQNFV 


3217 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MhTVVQKXDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYAIWSPEWDAENQGSFCNG 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 

3219


A

* 1


1

1623


1563



572


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLPvPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLMQEQCESLGPGLAVLCKNfYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP 

ASITKECIILVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYATVP^PFWnAPMnr^QTrrMn 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 
miQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 
3ACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA 

rSAEGWKGCTCTFKDRSKLREHLRSHTQEKVVA " 










CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATERLLRDHMKNHVNHYKCPLCDMTCPL 

PSSLIWHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCS1KSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV - 


3220 


A 


2760 


745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENWDREQIDFLAARQQFLSLE 

Q ANKG APHS SP ARGTPAGTTPG ASQ APKAFNKP 

HLANGHVVPIKPQVKGVVREENKVRAVPTWAS 

VQWDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGVYRWEYFRLR 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSP1 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYIISVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGWLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEHQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTnGVGVGAGAYILARYALNHPDTVEGLVLINI 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNIITrL^^ 

RRDLNFERGGDITLRCPVMLWGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGY1JDDCTCDVET1DRFNN 

YRLFPRLQKLLESDYFRYYKVNLBLRPCPFWNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLEEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG 



3224 TA 



803 



3225 (A 



5054 



TSEEN LtYS WLEGLCVEKRAF Y RLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFOL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC 

RLWGKLQTQGLGTALKJLFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSRNLLQ 



PUSTISWDRJJAAGESGTRAASPSPSGSRTAGRLP 
SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 
LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 
TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 
IPSYIRDSTVAVVVYDITMLNSFQQTSKWIDDVRT 
ERGSDVIIMLVGNKTDLADBCRQITEEEGEQRAKE 
LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 
QEKSKEGMDDIKLDKPQEPPASEGGCSC 



PEV lKPSLSQFrAASPIGSSPSPPVNGGNNAKRVA 
VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 
GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 
GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 
SSNNGTSPNPimWDKVIVDGSDMEEWPCIASKD 
TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 
GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 
TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 
PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 
QTSREQQSKMENAGVNFWSGREQAQIHNTDGP 
KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 
TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 
QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 
SWDNNNRSTGGSWNFGPQDSNDNKWGEGNKM 
TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 
GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 
EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 
QAVLQTLLSRTDLDPRVLSNTGWGQTQ1KQDTV 
WDffiEVPRPEGKSDKGTEGWESAATQTKNSGG 
WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 
WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 
QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 
WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 
SKGGWEDCKRSPAWNETGRQPNSWNKQHQOO 
QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 
WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 
GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 
NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 
PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 
GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 
SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 
DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 
GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAOVP 
PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPO 
QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 
QRaSQAVRQQQEQQI^RMVSALQQQQOOOOR 
QPGMKHSPSHPVGPKPHLD^MVPNALNVGLPDL 
QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 
FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK 










TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNKMWKNmSSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLL ALCH VDGRVPFRPS S A VLLTELTKLLLC 

AFSLLVGWQAWPQGPPPWRQAi#FALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFWSCSLWNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 


1104 


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVWGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFL1CFSWNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

WDEAILTIFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722 


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKXQIEELKGQEVSPKVYFMKQTIGNSCGT 

IGLEHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILrTslNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

GKEQRWEMVMDKXHFKLWRRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VIERDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR | 
3231


3232 A


2117


3233 A 3


590



Yc V S WMVSSGMPDFLEKLHMA'l LKAKNMEIKV 
^YISAKPLEMSSEAKATSQSSERKNEGSCGPAR 



718 



FYftr ftAUASSFCAPGDPDMSl-'RK WRQSKFRH 

VFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVTVEASGGGAFLVLPLSKTGRIDKAYPTVCGH 

TGPVLDmwCPHNDEVIASGSEDCTVMVWQIPE 

NGLTSPLTEPVWLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE 

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPOR 

GMGSMPKRGLEVSKCEIARFYKLHERKCEPIVM 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 

DADPILISLREAYVPSKQRDLKISRRNVLSDSRPA 

MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 

KLEEVMQELRALRALVKEQGDRJCRLEEQLGRM 
ENGDA 



718 



RLKliDDRRGLPLSSPLWTEPPLSC'CLPATYPADM 
GTAGAMQLCWV1LGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYTPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVVWI 
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKN1NMNNGKQSLSAEKVL 



RLKEDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
G1PGAGVPSSGRDGGTSRDTFQTVPPNSTIMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVWVI 
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 



4292 



AGjjuuklu VUUSEFPWEGSALGASPLPPICLQSR 
TWLLRAP APAELGELEEVAAGRGDVWEPFLDSP 
GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 
RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 
SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 
KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 
RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 
EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 
VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 
WMEAN QHSLNILGQKVSMHYSDPKPKINEDWL 
CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 
RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 
QALSQGSEPSSENANDTIILRNLNPHSTMDSILGA 
LAPYAVLSSSNVRVDCDKQTQLNRGFAFIQLSTIE 
AAQLLQILQALHPPLTEDGKTINVEFAKGSKRDM 
ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 
TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 
HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 
VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 
QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 
SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 
NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 










GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQDLG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQILPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKWQFHWRNMHAPGMKK1KLD 

TPEEIARWREERRKlsTYPTLANIERKKKLKLEKEK 

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVL3NSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RJRRGRWSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKTNEKTRKVTTVKKFFSASSRVGSKKEIQEAKA 

PSPSINRQTSDETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSDEEQSECAQDFYHNVAE 

RMQTRGKWPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQKJRJRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDIIEMDSKRVPRDKLACIT 

KCSKHIFNADOTKNEPASADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449 


VLSVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYT S AKEEFRPEMLQGKK VIVTGA SKGIGREM 


3239 " 








AY J^Ai<LM<JAH VVYTARSKETLQKV VSHCLELG 

Mi^iLNHITNTSLNLFHDDIHHVRKSMEVNFLSYV 
VLTVAALPMLKQSNGSIVWSSLAGKVAYPMVA 

LIDTETAMKAVSGIVHMQAAPKEECALEIIKGGA 
LRQEEVYYDSSLWTTLLUCNPCRKILEFLYSTSYN 




A 


213 


422 


tKTMQLEIKVALNFIIFYLYNKLLW/QPLKKK*EA 

HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240 
3241 


Ta 

A 


"1255 


1425 


wiib in VNPNLCNPVAPTSGAHSIG*KWPS WLGA 
VAHSCNPSTLVGRGGRITRGOELR 


3242 




161 


547 


1" AOlOKs 1 AK 1 ffj l TOSLEMENLKSG V YPLKEAS 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 
DLITCLEQGKEPWNMKRHAMVDOPPOR 


3243 


A 


50 


241 


i. U ARUkS 1LPA 1 JhusySAf JiLASMS VVPPNRSOT " 
GWPRG VTOFGNKYIQQTKPLTLERTINT . 


3244 


A 


380 


702 


!• V A YLKLPFFSQ V CLFASSEMFFTISRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREOI 
J^LS^YVKALAEENKNTVDVENGASMAGYGK 


3245 " 


A 
A 


37 


1391 


VLMDGRMMKSMRLREEESPGPSHTASCLCGSAP 

CILCSCCP ASRNSWSRLIFTFFLFLGVLVSIIMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLBLVGLTVGAFYIPDGSFTNIWFY 

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

V"NbGLLQASVITLYTMFVTWSALSSIPEOKCNP 
m.PTQLGNETVVAGPEGYETQWWDAPSIVGLIIF 
LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATO 
QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVT 

AbLHVMMTLTNWYKPGETRKMISTWTAVWVKI 
CASWAGLLLYL 


3246 




52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAHOV 
LTFLLLFVITSVASENASTSRGCGLDLLPOYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF 
FKEKEKKSPVGLHFLFLLGTLGP 


3247 i 


A 


3 


515 

: 

( 


HUVCGSGCCCHUJAUjP V ARQKALPRLRG VMS - 
RFLNVLRSWLVMVSIIAMGNTLQSFRDHTFLYEK 
LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI 
-DvfKTLYHITLWTFLLALGHFLSELFVYGTAAPTr 
3VLAPLMV ASFSILGMLVGLRYLEVEPVSRQKK 




V 1 


S 


>32 I 

I 
S 

C 

E 
L 


iKLCFPCMQSKJYSYMSPNKCSGMRFPLOEENSV 

HHEVKCQGKPLAGIYRKREEKRNAGNA VRSA 

4KSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 

SCDSTNAAIAKQALKKPKGKQAPRKKAOGKT 

>QNRJCLTDFYPVRRSSRKSKAELQSEERKPJDELI 

SGKEEGMKIDLIDGKGRGVIATKQFSRGDFVVE 

HGDLBEITDAKKREALYAQDPSTGCYMYYFOY 

SKTYCVDATRETORLGRLJNHSKCGNCOTXLH 










DIDGVPHLILIASRDIAAGEELLYDYGDRSKASIE 
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249 


A 


43 


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKXIFVGGIKEDTEEYNUIDYFEKYGKIETIE 

VMEDRQSGKKJIGFAFVTFDDHDTVDKJVVQKY | 

HTINGHNCEVKKALSKQEMQSAGSQRXjRGGGS 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN i 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKXGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKEL3VIFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDHTDPFKLAEESDS3V1KSRCVPDAA 

GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKfflREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIWSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGUCKSC 


3252 


A 


1 


574 


PLGSKTAPALRVMVQAWYMDDAPGDPRQPHRP 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KJRRERNYS WMDU T1CKDKLPN YEEK1KMF YEE 

HLHLDDEIRYILDGSGWDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFTVDEKNYTKj^VIRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRRPQVTLLDPNE 



3254 TA 



968 



173 



439 
"377" 



3257 I A 



1454 



3258 nr 



1558 






K Y LLRLLDKTTVSHNTKRFkJb ALP 1 AHHTLGLPV 
GKHIYLSTRIDGSLVIRPYTPVTSDEDQGYVDLVI 
KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF 
RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG 
MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 
TEKDIILREDLEELQARYPNRfKLWFTLDHPPKD 
WAYSKGFVTADMEREHLPAPGDDVLVLLCGPPP 
MVQLACHPNLDKLGYSOKMRFTY 



LQbAUBUVlHVLlLLESPAkPVAAVTQVQRRRY 
HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 
SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 
QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 
NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 
YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 
DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 
PQVPVPGSDISETQVERXRGKKRNKEATGKDVE 
SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 
EQLRGPCWDQSSKASAQDAGDHVQPA 



GSAAMKVKiKC WNG VAT WL W VANDENCGICR 
MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 
ILKWLHAQQVQQHCPMCRQEWKFKE 
1AARRRQKGTAARRRQKGTLEEVVLPPRSCRVF" 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAOTA 
GNVFLKHGSELRIIPRDRVGSC 



UCSAAAAGAUSGPWAAQEKgFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMWRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAFLYNDQLIWSGLEO 

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

DTYEELHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSIV GPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFrYTNHMNLAEKSTVHMRKTPSVSLTSVHPD 

LMKILGDINSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRRELYVTLNQKNANLIEVNEEVKKLCATQF 



APR&CSMPHRKKKPFIEKKXAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK 

EETLVDPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLOEVL 

>TOYYKEKAENCVKLNTLEPLEDQDLPMNELDES 

EEEEMITVVLEEAKEKWDCESICSTYSNLYNHPO 

LKYQPKPKQIRISSKTGIPLNVLPKKGLTAKOTE 

RIQMINGSDLPKVSTQPRSKNESKEDKRARKOAI 

KEERKERRVEKKANKLAFKLEKRRQEKELLMLK 


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLHLATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMNIQTQNKVITY1ACLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMWRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGELYSySKUSSIHAISSAQGKYKAFSrC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTVVTPMLNPFIYS 

MKGQFFKKCP 


3260 


A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEDCRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGELSPSELRKIFSNLE 

DILQLHIGLhreQMKAVRKRI^ETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIEPTQMQRL 

TKYPLLLDNIAtYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PNVEELRNLDLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDELVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASMLVMDHMIMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GEPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTBCIELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

DLCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIWSMVSSSLLTVRAAIPIIMGANlGTSrTNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEJITQLIVESFHFKNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQINVIVPSTANCTSPSLCWTDGI i 

QNWTMKNVTYKENIAKCQHIFVNFHLPD 

ILLILSLLVLCGCLIMrVKILGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT 

SALTPLIGIGVITIERAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPVVFniLVLCLRLLQSRCPR 

VLPKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 





(3262 
T3263 


A 


30 


1377 


SDSKTECTAL 

yyQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTBCRVVLF 

^LAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIADLGIVLSLPVWMLEVTLDYTWLWG 

SFSCRFTliYFYFVNMYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVRRAMCAGIWVLSAIIPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

^AWAVFVMCWLPYHVTLLLLTLKGTHISLHC 

HLVHLLYFFYDVmCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTOHSI 

IITKGDSQP AAAAPHPEPSLSFQAHHLLPNTSPISP 

XQPLTPS 




3264 


A 


1 


919 


yARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDAKREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHVVIYDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFDCTYEDIKENLESRRFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNflPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAY^CGKPDWIYDGSWVEWYMRARPEDVISE 




3265 


A 


1 


1398 


AKKSTPRTAPRASATRSAAGTMREIVHIOAGOCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRA1LVDLEPGTMDSVRSGP 

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

?j™ VV ^ SESCD CLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTWE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KLTTPTYGDLNHLVSATMSGVTTCLRFPGOLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSOOY 

RALTVPELT(^MFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIOELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDL VSEYOQ YQDATADEOGEFEEF.FGFnp a 




J266 j 


A 


265 


oo2 

: 

i 


WWtDARVLGPFHPKEEGHWVMTPSEGARAGTG 

RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 

VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 

LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 

RTDPVPVTIALDSLSWLLLRLPCTTLCQVLHAVS 

3QDSCPGETPPSLFPLMLPLPRSVPr.FT.STT F 


3 


267 | A 


^ : 

8 


2 . i 

02 i 


584 

( 
I 
I 

/ 
F 

N 

Oil A 


\AUAGADGREPASERASRAEPPAVAMGQNDLM 

jTAEDFADQFLRVTKQYLPHVARLCLISITLEDG 

RMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

vGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 
LWDLKFLMRNLALGGGLLLLLA F <?Fr!i<r qvtp 

lGVPTMRESSPKQYMQLGGRVLLVLMFMTLLH 
DASFFSIVQNIVGTALMILVAIGFKTKLAALTLV 
r WLFAINVYFNAFWTIPVYKPMHDFLKYDFFOT 
1SVIGGLLLVVALGPGGVSMDEKKKF.W 

SH-CSAWKRRS1AALWWSGSRASRSHPRBI nx> 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










LCFVFGTAALS1RSMDVLSLFLEHGKLVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

iVTITNAGPLHPYWPQHLRLDN^ 

GLFSVTGVLVYTTwLLSGRAAVWLGTW 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

C1T lin7\7T A PT T~% /""\T TT1T T> T7TT /~\T "t 7\ / CI 7/"* /~\T'\./'/~'TAA 7T \7T^ 

SLWVVIAFLRQHPLRFILQLVV SVGQ1YGDVLYF 
LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 
LPG VL VLDA VK-rtL I HACjS X LJJAKA I KAKoKKN 


3270 


A 


17 


229 


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYSRKQVSDEIKXSRRTVESNPIFFKKNKKI 
Q 


3271 


A 


419 


553 


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272 


A 


1211 


1450 


FQFIQffiLLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFniXCFGLSIFPSVIFFLHVYFILTLWF 
YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTIFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEEEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEYINLMKNKL 

DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAVVMGFEDPMLQTD 

DTPIKRCLQTKWPYDBLLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPrVPL 

DRETQAQPPIXjDHSPGNHEQSYV^ 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TnT pi nCTT ACn "WT A T> TIT T TTT T CT /^T Y17/~1/"\T> OTX7V 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 
PWLLAGWDVTSLSLLSDRKGLTRRERRELRRR 
TBLLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VuLV IKrLJVLU iLr 1 WyisJYr Y o WO 


3275 


A 


575 


759 


SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 




3276


A


/ 


2jo 


V A A/^ITDT T T A A /T.lJ"DCK/rDOO'ri/" v r T \1/T?7^ CT T7T DDT 

QmSSLLVLVSTTCLFAFPRVPIAFESKSCLIYHCH 
CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK 



3278


A



876 



3279


A


82



2929 



LPEA VQKELL AE WKRTns; mminijrp 




?^4^t GGFQNIACLRSNAM KHLPNFFY 



tlcepgsgqirysmpeeldkgsfvgI^Lgle 
pqelaergvrivsrgrtqijalnpr^tagIi 




S==5 

svdnyyhllttrdldreets D YNITLTVMD HOT 
PPLSlESHIPLKVADVM3WP^OASYW7^r 

^rgvsifsvtahdpdsgdnar^sla^dS 

APLSSYVSINSDTGVLYALRSFD^^oTw? 

^c^ VIVLLVL ^^™SRLLQAEGSPrA G 
VPASHFVGVDGVRAFLQTYSHEVSLTADS^K^H 










VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GKKEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSWAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVI 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RIHTETGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAIIVAQWDRFATPKHRQT 

RAGVFLGLGLSGWPTMHFTLAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQIFHVLVVAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSlVtNQEKLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKLAVNNIAGIEEVNMKDDGTVIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYBCIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AVWKTFYVTEKAWNYYPYTITEYTCSFLPKFSIH 

IETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVVR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHEQTNIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


IKSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAWGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 



HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

KKGHTRNKSTSSAMSGSHQDLSVIQPIVKDCKEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEA\^NNTLSIEPVGLQPIRFV 

KASAVECGGPKKCALTGQSKSCKHRIKLGDSSN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS 



3287 



1



A ~ \W 



A |T 






390 



1743 



839 






ADMDFNQLE Ai-'LTAQTKKQcjUl'l SDQAA VISKF 

WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 

SQSRHSAQIHTPVAIIELELGKYGQESEFLCLEFD 

EVKVNQILKTLSEVEESISTLISOPN 

LUAMAKHHP DLlhCRKQAG V A1GRLCEKCDGKC 

VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 

GVSDAYYCKECTIQEKDRDGCPKIVNLGSSK1DL 
FYERKKYGFKKR 



RT1FFRFRPCESLCGDMKLLTHMLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 
EETES 



AGCCRDTRFPTPRUPGSLCHNFCRSAACTVTRTT~ 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTO 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTrllRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASOLOI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAFRYFSSLHfflDERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 



UKPKSSSDNRNFLRERAGLSSAA VQ'I'RIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

VVAVAVVWWSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPOE 

VTLQLFTDGITNKLIGCYVGNTMEDWLVRTYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDIPSSQILQEEMTWMKEILSNLGSPWLCH 

NDLLCKN1IYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQVNQFALASHFF 

WGL^^^AKYSTEFDFLGYAIVRFNQYFKMK 

PiiAQI-SAVLAREKGHLPtmHEAPMQMASAOD " 

ARYGQKDSSDQNFDYMFKLLIIGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTVFKNEKRIKLOI 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQKTYSWDNAQVILVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNINVKO 

TFERLVDnCDKMSESLETOPAITAAKQNTRLKET 
PPPPQPNCAC 










GKLPELQGVETELCYNWWTAEALPSAEETKKL 

MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 

LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 

FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 

ESMPEPLNGPINBLGEGRLALEKANQELGLALDS 

WDLDFYTKRFQELQRNPSTVEAFDLAQSNSEHS 

RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP 

NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 

QQGLRHVVFTAETHNFPTGVCPFSGATTGTGGRI 

RDVQCTGRGAHVVAGTAGYCFGNLHIPGYNLP 

WEDLSFQYPGNFARPLEVABEASNGASDYGNKF 

GEPVLAGFARSLGLQLPDGQRJREWIKPIMFSGGI 

GSMEADfflSKEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 

NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 

LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 

ALLLRSPNRDFLTHVSARERCPACFVGTITGDRRI 

VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 

VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 

LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC

VGPLQTPLADVAVVALSHEELIGAATALGEQPV 

KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 

CSGNWMWAAKLPGEGAALADACEAMVAVMA 

ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 

AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 

QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 

ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 

NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 

DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 

VSVNGAWLEEPVGELRALWEETSFQLDRLQAE 

PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 

GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 

DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 

SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 

CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 

PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 

MEGAVLPVWSAHGEGYVAFSSPELQAQIEARGL 

APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 

DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 

SPWLQLFINARNWTLEGSC 


3293 


A 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEIKKLIYVHLVIWLLLVAKMS

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 

WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 

WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

LRSGKEAKILQHFGDGLCRMLDERLQRHRTSGG 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVTLLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQICLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 










QPLELRPGEYRVLLCVDIGETRGGGHRPELL3<EL 

QRLIWTHTVR^HVGDFVWVAQETNPRDPANP 

GELVLDmVERKRLDDLCSSIIDGRFREQKFRLKR 

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VIDGFFVKRTADIKESAAYLALLTRGLQRLYQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQRNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN

PDPDAETLYKAMKGIGTNEQAHDVLTKRSNTQR

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KmGTOEAlKFITILCTRSATHLLRVFEEYEJOANK 

SIEDSIKSETHGSLEEAMLTWKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296 


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

Al^LKTNILAAQSVIKKDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAMLnVCFIFIS 

M1LFIRIMPKLK 


3297 


A 


46 


01 / 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TGIPGSPACRQPWGLHSL1WYRMAMVSAMSW 

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 

TCEVIAAHRCCmMUEERSQTVKCSCLPGKVAG 

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRJHPRT


3298


A 


i J / 


HA 0 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD

PLNKSSYKYEADTVDLNWCVISDMEVIELNKCT 

SGQSFEmKPPSFDGWEFNASLPRRRDPSLEEIQ 

Kia.EAAEERl^YQEAELLl<^AEl^HEREVIQ 

KAIEE>M^IKMAKEiaAQKM^ 

AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVX

GGAPRl^TPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSIPEKPKLRFIERAPL 

VPKVRREPKNLSDIRGPSTEATEFTEGNFA1XALG 

GGYLHWG1TEMA41^TINRSMDPK^MFAIWRVP 

APF1<PIT1^SVGHRMGGGKGA1I)HYVTPVKAGR 

LVVEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKJDQEERERIWQiNPWTra 

GDIKVLSPYDLTHKGKYWGKFYMPKRV 


3300 


A 


2 


1847


FVAGGPRGSGSAAETMPEIRVTPLGAGQDVGRS
:1LVS1AGIWVMLDCGMHMGFNDDRRFPDFSYI 
rQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 
i^GPIYMTHPTQAICPlLLEDYRKIAVDKKGEAN 
^FTSQMKDCMKKWAVHLHQTVQVDDELEIl^ 










YYAGHVLGAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWIDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKXVHETVERGGKVLEPVFALGRAQELCILL 

ETFWERMNLKWIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQRM4FEFKHIKAFDRAFAI>NPGPM 

VWATPGMLHAGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSPVGISLGLLiCREMAQGLLPEAKKPRLLHGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRJFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184 


LRRNCSALGGLFQTHSDMKGSYPVWEDFINKAG 

KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSrEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDtiAKEYKKARQEI 

KKKSSDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALIEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 


3303 


A 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHWSFFQKMDRNKDGVVTIEEFIESCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVTPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 


3305 


A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVVVVLKW 
GMTLFLLYFPQIFNKSNDGFITTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A 


2 


872 


TLSSACLIGDAWKELTTVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRISGHVGIIFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGIRL 

WHLVGRGYRGLG/DLGSLLQVP**ARYTQGCDS

GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 

FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH 

QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH 






WO 01/57190 



PCT/US01/04098













VRGAICQERRLLTSAEKAIKNKNPPSSKPNRSSSNF 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 


3308 


A 


400 


1U/ / 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDHFHPWKKEENGNQSRVIPYTITLTNP

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFEEKNFWSGLEDYFRHL 


3309 


A 


400 




NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDHFHPWKKEENGNQSRVIPYTITLTNP

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310 


A 


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 

AQMAALQAKALAETGIAVPSYYNPAAVNPMKF 

AEQEKKRKMLWQGKKEGDKSQSAGNMGKN 


3311 


A 


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 


A 


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS
P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 
AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSP\S 
AoAFCKA vPLSPRRLTwPPHLQVGILIPTGRPWK 
NL 


3313 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T
PRCPAALRAGAHIGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 
KSNSMLOKPTVAYVRPMDGOP^MFPK'T qqfhvcc 

QSHGNSMTELKPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 










SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRXAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKXNWEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLVFDDR2s[YSADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAWSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

NXTRDIKTAAKELLKKVKFIPGSALNGMVEMMD 

RRPYWCISRQRVWGVPIPWHHKTKDEYLINSQT 

TEHIVKLVEQHGSDIWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDELDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512 


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 

SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF 

GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP 

LYWDCLLMRGHEITEQPMKAEVRAGSIMVKEAIF 

LFRKGHSKGKLFLLFFLPFLQVHKTFPTTDGFHW 

AP 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAP\SFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 
WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 
GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 
MKLISPRDFVDLVLVKJR.YEDGTISSNATHVEHPL 
CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 
HTDLSGYLPQNWDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTS1AKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVWTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLGQTGKSAVCfflQDINDDH 

VEDVT 


3323 


A 


8 


459 


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 

TKTYSNEWTLWYRPPDILLGSTDYSTQIDMW*G 

QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 

RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

EAWALCAVETHR 


3324 


A 


1276 


466 


PGSTHASARITIY*L*nLSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 










SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRPIESEEGDSHRRHKHKKSICR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 

RNLLSIDFDHITRTGKTYDDHRKFTLRJLYDQTGR 

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD 

LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYKNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTEQFGKFSVINYDLNQVITTTVMKHTKIF 

SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDANITRYFYEYDADGQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EDLYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL 

VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 

FNLSTKLKYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPKPELENSPSI*QMSNSMLHLLCASLS*

TE.GIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRFAAVPSVFGKGIKFAIKDGIVTADIIGVA 

NEDSRRLAAILNNAHYLENLHFTIEGRX)THYFIK 

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 

NLGPCRRKJILQTLMRLAAGFQYSSHKDPSLSAK 

EKHTDYHNEARGPWPGWVG*RTADGSCGRGPD 

GAHHPGPKSSSWRASRLLPGLGGSHHLDAYVGR 

DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS 

GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

A VrKHRA WRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIfflQRTHTGEKPYECNQCGK 
SFCQKGTLTVHQRTHTGEKPYECNECGKNFYQK 

T TTT T/**\T T/*VTfr T^T TCI _r~*i T" T"""T~n — _r*~i m r j - » _ i ■ j _ . j ■ 

LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 

QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 

GVKLY 


3328 


A 


1 


270 


VTRKLP3FIVDAFTARAFRGSPAADCLLENELDED

MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 

IWFTPTTDLOILTSSILPSIL 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYrWKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A


64


430 


FWRKE^GLAPAAAVATTTSSSTMRFTSISNSLTST 










AAIGLSFTTSTTTTATFrnSTITI^ 

LLSRGFEM.VPYTSTVSVVTTPVMTYGHLEGLIN 

EGNLELEEKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNVVPR1NTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWnFLPPLTSCPLWAPGTKHKTTLEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQfflSSRRHEIVDPV 


3334 


A 


304


410 


AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVIAKNMYYLTQDDESIISAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYNPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHHKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAVVALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKJEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665 


AAAASNWGLITNTVNSIVGVSVLTMPFCFKQCGI 

VLGALLLWCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSNFFARLFGFQVGGTTOMFLLFAVSLCI 

VLPLSLQR>043V[ASIQSFSAMALLFYTVFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A 


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIXRTPRKYLAEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARS1GAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 


3342 


A 


385 


2 


NLTWWPLFRDVSFYIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFCYVVFMKFhT/QVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMRNSIFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFnSSERSFKLDSLKCGTWYKV 










KLAAKNSVGSGRISEIIEAKTHGREPSFSKDQHLF 
THINS TH A R T TsJT OGWMMOnPPTT A n/T rvDDvrT 

WAWQGLRANSSGEVFLTELREATWY 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPIEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPPTTSNRHRRQIDRGVTHLNISGLKI^

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMTOEPHAIVVDPLR^ 

ffiTAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKLSVIGSIRLNGTDPIYAADSKRGLSHP 

FSroVFEDYIYGVTYmNRVPXIi^ 

GGLSHASDVVLYHQl^QPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

i ut 4 1 UfKC i VCAGYCANNSTCTVNQGNQPQ 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACWInTK 

V&uuv I^N^IJJGKVAPSCLTCVGHCSNGGSCTM 

NSmMPECQCPPHMTGPRCEEHVFSQQQPGHIA 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITG\SHRARPENGFENIF 


3348


A 


1 


1171 


LSKITMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVHNnVGKLWffiQYGNVEDNH 

KTGDKCVLNFKPCGLFGKELHKVEGY1QDKSKK 

isXfWUv i UJs. w 1 bCL Y b VDPATFDA YKKNDKKNT 

EEKKNSKQMSTSEELDE3VDPVPDSESVFI1PGSVLL 
WRJAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 
iris, i u^i^t^ULKANLbsN ObXDQASEEKKRLEEKQ 
RAARKNRSKSEEDWKmWFHQGPNPYNGAQD 
WIYSGSYWDRNYFNLPDIY


3349 


A 


403 


497 


NFASSSGKYLRTQIOKCLNNKJvEPl^ 

VRPP*SNRIY*ILQS*NISFS*LPN*NFASSSGKYLR 

TQKIKCLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSr^ESDiRKPARRKIQTTNP 

ur ]^,L,L,L.r ivia v r V V aAFJrr Crr AbGSRDGRPKAS V 

ARPAAVT1EHHSP1UDCG1^PDVIRSSLGGWQPH*P 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

*-rr\jf\ w v ooou^KruL 1 rlrLrA Y oriGCVPSEG 


3351 


A 


1 


428 


MAAVVAATALKGRGARNARVLRGILAGATANK 
ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 
GKNPMKAVGLAWAIGFPCGILLFILTKREVDKDR 

VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 


3352 


A 


2 


841 



RTLFRGRRRREDDRISRPHPSTAESKAPTPKFDLL
ASNFPPLPGSSSRMPGEL\^ENRMSDVVKGVYK 










EKDNEELTISCPVPADEQTECTSAQQLMMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTKPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKJEPSSVLVQPLRELRSNVVSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

IIPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 


3353 


A 


1054 


587 


IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 

PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 

SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 

THLGTIPVPKGKPIAL\^EIR>IRKDVKVFNVTK£ 

NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEWER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLVRWAEELENVRILP 

HTVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQ\O.GIVTPGIVVTPMGSGSNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRVVLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGVVGLSFSGHRJQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKI)DDLFHSYTTIMALIHLGSSK 


3355 


A 


1 


707 


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 

SRLLRERIVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSR1MIH 

QPSGGARGQATDIAIQAEEIMKLKKQLYNIYAKH 

TKQSLQVBESAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356 


A 


352 


338 


FNYNFCRNLHMPSFLV*PGMCGLLAKHLSFHIVG 
AFLIT/LGVAALCKFAVA*PRKKAYADFYRNYN* 
EKEFE VRKAN1 S Q STK 


3357 


A 


1 


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIRIQREHTAVLKIEGWYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGISI 
MVRAQFRThn^PADAIGHRIRMML*PSRMYTTEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

YMDSERQVKDTDDIESPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 

RGSSDGRGSDSESDLPHRKLPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPRTVPV 

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLDC 

KEEERKKMEKLLAGEDGTSERRKSKTYREIVQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSLPPPKFTATVETTIARASVLDTSMSAGS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLELKQDNGSEEINIKKPNSV 

PQEl^AATTEKTEPNSQEDKNDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 
KDQKKPENEMSGKVELVLSQKWKPKSPEPEAT 
LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 
TTPFKFWAWDPEEERRRQEKWQQEQERLLQEE. 
YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 

II\EDPWPFWSSSSADQLSTSSSMTEGSGTMNKI 
DLGNCODEKODRRWKKSFOGDn<;nT T t ktbkq 

DRLEEKGSLTEGALAHSGNPVSKGVHEDHQLDT 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 

KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMnETLNLYFHIOCFRPG\Tri<rnnT nr>A 

VSGTDVRJKNGLLNCNDCYMRSRSAGOPTTL 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 
AWAGRATSM*TSSYSSEYOPOTP*Ai vtt pppcv 

YLLTHLLTLTHLHHQILFEP 


3360


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMK\ 
KSPEUSRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDOLISS3VdPCISHr,MTAS AS A T 


3361 


A 


4619 


532 



LLLGRANSPPYNSVVRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE 

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL 

MMMVKEKMTTIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTPJRVRKKLIRVEEMKKP\STEGGEE 

HVFENSPVLDERSALYSGVHKKPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 

KKGLGSLSHGRtCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVIKSPTASRISLGKKVKSVKET

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMG 

LLN>KVGTFTSfFIYVDVLSED\EEKPKRPTRRRRK 

GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVLN\KNPsRKPPSFPSCRSC\ETLAEGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPO 

[VPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTKKLEGSIAASGRGLSPPQCLPRNY

DAQPPGAKHGLARTPLEGHRKGHEFEGTEFFLPLG

RKEGVDAEQRMQPRTPSQPPPVPAKKSRERLANG

-HPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD

:PPALAPRPLSGQALGSPPSTRPPPWLSELPENTS

.QEHGVKLGPALTR\KVSCARGVDLETLTENKL\
HAEGIRSSRE£PYS*LRHGRCGI\P\EALVQRYAED 
LDQPERDVAANlVffiQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSPGCIS\SVSDWLISIGLPMYAGTLSTA 
GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 
RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSWLARARRTDRWT

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

HIKPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQRIESKHRKNPmKM.QMWEEMK 

KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 

IQPHPRTGN*Y\NV\YPTYDFACPIVDSffiGVTHAL 

RTTEYHDRDEQFYWIIEALGIRKPYIWEYSRLNL 

NNTVLSKRKLTWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 

WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFINWGNLNITKIHKNADGKIISLDAK 

LNLENKDYKKTTKVTWLAETTHALPIPVICVTYE 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 

VAAQGEVVRKLKAEKSPKAKINEAVECLLSLKA 

QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 

TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 

AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 

KKKEKENKSEKQNKPQKQNDGQRKDPSKNQGG 

GLSSSGAGEGQGPKKQTRLGLEAKKAEENLADW 

YSQVITKSEMffiYHDISGCYILRPWAYAIWEAKD 

FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 

YAKWVQSHRDLPIKLNQWCNVWWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELLAIPWKGRKTEKEKFAGGDYTTTEEAF 

ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 

EKQFAYQNSWGLTTRTIGVMTMVHGDNMGLVL 

PPRVACVQVVIIPCGITNALSEEDKEALIAKCNDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQAILEDIQVTLFTRASEDLKTHMVVANT 

MEDFQKE.DSGKIVQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 

LAAPKETDCVLTQK\LI\ETLKPFGGFLKKEEGTA 

SRRNrW"GKN*INLVKEWIRRNQ*BLAKNLPQSVI\ 
ENV\GGKIFT/FLGSYRL/GEVHTKGADIDGVCVF 

APRHVDRSDFFTVSFYDKLKLQEEVKDLRAVEEA 

FWVIKLCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKM.DIRCIRSLNGCRVTDEILHLVPNIDNFRLT 

LRAIKLWAKRHNIYSNH.GFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPnTPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEILLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESKIRILVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFKKTENSENLSVDLTY 

DIQSFTDWYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATK1PTPIVG VKRTS SPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSIKLRLNR 


3364 


A 


54 


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHWREKTGAEEQ/WKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQPiFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGIILQWLQSDPYLSSVS 

HTVHLDEIHERNLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPWEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLIIPLHSLMPTVN 

QTQWKRTPPGVRKIVIATNIAETSITIDDVVYVID 

GGKJKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R 

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP 

HIGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTVVNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKIIKAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

GTEVSPYCLLFFGGDISIQKDNDQETIAVDEWIVF 

QSPARIAHLVKRAVVHMDERREEQIVQLLNSVQ 

A TfMTYl?' CCD A f"MO lire A DCr\T.T/^ , \/r\TrT/-\mnTrT-« 

/iJSJNJJJsJioiiAi^lb Wr AriiDHG YDKKYFFKE 


3365 


A 


439 


878 


ECCNVRPLRETDLLKMKRKPRASSPVVEEQPRA

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 


3366 


A 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGRIQLREQLPRYLMGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHYFVATQDQNLSVKVKKKPGVPLM

FnQNTMVLDKPSPKTIAFVKAVESG\RLSQCMRK 

KVSMSKRNRV**KTLNRGRREGCRKKISGPNPLS 

CLKKKKKAPDTQSSASEKKRKRKRIKNRSNPKV 

LSEKQNAEGE 


3367 


A 


40 


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMIDLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAJYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDWPKDVNVAIAAIKTKRTIQFVDWCPT 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF 


3368 


A 


3 


2597 


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG 

DKTTSFAEQKJiRKLNHTDGESSGSSSQKTTPEGSE 

LNIPHAGAWAQIPEETGLPQGRDTTQLLASEMV 

HLMMK\LKEKR\RAI*AQKKKMEAAFTKQRQKM 

GRTAFLTVVKKKGDGISPLREEAAGAEDEKVYT 

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSD\ 

SPR\PTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETRRKTEEERQKXEDERARREFIR 

QEYMRRXQLKLMEDMDTVIKPRPQVVKQKKQR 

PKSMRDH3ESPKTPIKGPPVSSLSLASLNTGDNES 

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHIIQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

RDSGCQFRSLYTYCPETEEINKLTGIGPKSITKKM 

BEGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 

WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 
YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTV AEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRIGLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3371 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID

YSAVLFPC*AMDHLESFLAECDRRTEIAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTV AEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 


3348 


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 

MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 

ARRSGEVTLTKGDPGSLEEWETWGDDFSLYYD 

SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 

EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 

KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 

GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 

TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 

CMATESVDGELSGCNAAILKRETMRPSSRVALM 

VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 

DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 

QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 

TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 

PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 

LRFHPRQLYLSVKQGELQKVTLMLLDNLDPNFQS 

DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 

VDKQQRTPLMEAVVNNHLEVARYMVQRGGCV 

YSKEEDGSTCLHHAAKIGhJLEMVSLLLSTGQVD 

WAQDSGGWTPnWAAEHKHIEVIRMLLTRGAD 

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 

HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 

ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAIRTEKECRDVARGYENVPIPCVNGVDG 

EPCPEDYKYISENCETSTMNTORMTHLQHCTCV 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

PLEFECNQACSCWRNCKNRVVQSGIKVRLQLYR 

TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 

VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 

HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 

ELGFDYGDRFWDDCSKYFTCQCGSEKCKHSAEAI 

ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVG 
FANFVDFNPSGTdA S A GSnnTVirvwn vp \r\Tvr 

LQHYQVHSGGVNCISFHPSGNYLITASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV 
KRPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 

KVKA^SIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSILDIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYELFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQIT*GQEFETSLDYMVKPHLY


3375 


A 


3 


1051 


WTQQI1JVFPEQTNTKDWTVTPEHVLPESQSLLT 

FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 

ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKKJUCKLSTWKQELLKLMDRHK^ 

PFKCQECGKTFRV SS\DL\IKHQRIHTEEKPYKCQ 

QCDKBPRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 

HLIKHRRTHTGEQPYTCSICRKNFSRRSSLLRHQK 

LHL*REACPVSHFWKTF 


3376 


A 


137 


2329 


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCRKGALRQKWHEVKSHKFT 

ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVV 

HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 

TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 

SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTDPKRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 

ELYAIKILKKDVIVQDDDVDCTLVEKRVLALGG 

RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 

DLMYHIQQLGKFKEPHAAFYAAEIAIGLFFLHNQ 

GnYRDLKLDNVMLDAEGfflKTTDFGMCKENVFP 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 

VLLYEMLAGQPPFDGEDEEELFQAIMEQTVTYP 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 

IRAHGFFPLGFDWERJLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPA\LTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY


3378 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAK^QKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA*

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLFhTVXPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 
RRNTTNPS/CK*IRKLQGVI 


3380


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VKAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRERAYEMTMRV 

KDKVYHLECFKCAACQBCHFCVGDRYLLINSDIV 

UHV^Dl ihWl KIN OMi 


3381 


A 


945 


474 


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
QEEDBCKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQT1AETEATYLKILESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLIFI 

raWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHCICSTG*GRSNNYCRC*KVI*TGTQGR 

RNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE*ISSSYHVEHP*

KSL\KSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383 


A 


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVTACPPTKPLDQVCGTDNQTYASSCH

LFATKCRLEGTKXGHQLQLDYFG\ASKSIPI\CRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRNKVKKIYL\DEKRLLAGDHPEDLLLRDFK 

KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 

PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 

HCFGIKEEDIDENLLF 


3384 


A 


3166 


928 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLWSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFP(3TYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 
SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRPLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDXISREEVNEKLRDTPDGTFLVKDASSKI 

QGEYTLTLRKGGNlSnO.IKVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LK5RIA\EIHESRT\KL\EQQLLWRASDNKRD/IDK 

PH*TSLKPDLMQLRKERDQYLVWLTQKGARQKK 

INEWLGKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVWDGDTBCHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385 


A 


43 


2372 


TRDVNSWKELCFKHYNKETTNCYRTTRKWTNY 

KHFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHfflSHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

♦NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLDDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRJHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT

GEKPYECDKCGiCAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEBCPYECKECRKAFSHK 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLBDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLDHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKOTITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNnTHQKIHTRENPLSVHVEKASIRLWTSSDI 


3386 


A 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSVWPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSIDTASYKIFVSGKSGVGKT 

ALVAKI^GLEVPWHHETTGIQTTVVFWPAKLQ 

ASSRVVMFRFEFWDCGESALKKFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 


3387 


A 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPVVNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLIEQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKPEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 
GSHRTSAVRIFS 


3388 


A 


98 


3197 



ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 

NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEDLARDGPNALTPPPTTPEW 

VKFCRQLFGGFSELLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAVVIITGCFSYYQEAKSSKIME

SFKNMVPQQALVIREGEKMQVNAEEVWGDLV 

EDCGGDRVPADLRIISAHGCKVDNSSLTGESEPQT 

RSPDCTHE\NPLKTRNITFFSNNFVEGTARGWVA 

TGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLIT 

GVAWLGVSFFILSL3LGYTWLEAVIFLIGIIVANV 

PEGLLAWTVCLTLTABCRMARKNCLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMWAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNIPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFN^ 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAPDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKA1AKGV 

GIIFEG2snETVEDIAARLNIPVSQ\WRDAKACVm 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLIIV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASIVTGVEEGRLI 

FDNLKKSIAYTLTSNIPEITPFLLFIMANIPLPLGTI 

l illj.ij.lu i jjm V r AlbLAYEAAESDIMKRQPRNPR 

TDKLWERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIIU.NWDDRTVNDLEDSYGQQW 

TYEQRKVVEFTCHTAFFVSIVWQWADLUCKTR 

RNSVFQQGMKNKILIFGLFEETALAAFLSYCPGM 

DVALRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 
LRRNPGGWVEKETYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSVVRKEHNS 

KLTTTFPAMVHRTAGQKDSEPLGIEEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLWPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDWLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

WIRLQSHWIWDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMWATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDILVKPKADVKRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMTDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGLWQYDLTVRDSDGSWQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGRTRPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQUKWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

NIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 

GEVLQKJDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKKX 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP

QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE 

SLWCQVWKLPLMKINFDMSSLVVSLAHGAVIY 

ATKGITRCLLNETT^^^IKNEKELVL^TOG]M,PELF 

KYAEVLDLRRLYSNDIHAIANTYGffiAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LhniFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLVVGKWRGGTGLFELKQPLR 


3390 


A 


2 


2080 


ELPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE 
ELETLCHQNMARAIETQEGLGIEYDEDVVCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITKISHIPASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEMRTTLADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRNLCYMVTRRERTKHAICKLQEQ1FH 

LQMKLEEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVP\GPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


NSFLHFLHLKVRTMFLFPSFPVLLLSWTASCSKT 
KACADTQKTCSMITCGIPVTNGTPGRDGRDRPK 
GEKGEPGLGQVSVAS*ISTSGRCSSKSVLEPATRG
LKHRLGEAPLSSGPMLHSEQPL*NAIASKTKLFV 
DSLGSfflSTQELGVCGCPFRGVSCLVGELALVQA 

lh*vagesfffgsdhwligcaggeqewsiei:lgk 
kkrvtatgssslclatgqglrglqgppgkmgpp 

GNTGTSGIPGPRGQKGDRGDNSVAEAKLANLER 

KL*SLRSELDHTKKL*PFSLGK\MSGKKLFVTNGE 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 

AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKK 

DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 
FPA 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYFLKQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

HEQMPTEffiFPESRXPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNIHQRTHTGEK 

PYGCEDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 


3393 


A 


46 


1464 


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

Mj'rurMiJL V uDK&QGGRSCQGQITSAASGKTSK 

SEPNHVIFKXISRDKSVTOYLGNRDYUDHV\SQV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTOAEEDKIPKKSSVRL 










LIRKVQHAPLEMGPQPRAEAAWQFFMFVDKPLH 

LAVSLNKRDLFPMGSPEPVPVSVPXNNTEKPVKKI 

KA\SVEQVANWLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLVNNTUERRGIALDGK^ 

EDTNLASSTHKEGIDRKRSWEILVSYPDQR*SSTV

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDANLVF\EEFARP*ILKDAGEA*\EGKRDQE 


3394 


A 


211 


1591 


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLVVQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEAIsIDLALRLARHYTGH 

QDVWLDHAYHGHLSSLIDISPYKFRNLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS*

KRWQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLDCDEATRTPATEEAAYLVSRL 

KENYVLLSTDGPGRNILKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395 


A 


1 


1424 


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDVVCDNCTKIEA 

KGTLNGEKVEHQRTTr^KQLKLGKLPQCLCIHL 

QRLSWSSHGTPLKRHEHVQFNEFLMMDIYKYHL 

LGHKPSQHNPKLNKNPGPTLELQDGPGAPTPGL 

NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV 

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD

GPHL 


3396 


A 


109 


107 


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPF3EALL 

PHVRAFAYTWFNLQARKRKYFXKHEKRMSKEE 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVILFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHHIGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS 

FVT\SGVRSVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

PVITGTQSmnATPSILVHFPRHSPFFQQPGPYFSH 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS 

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTRPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 


3397 


A 


1 


2002 


TGTLTEDGLDVMGWPLKGQAFLPLVPEPRRLP 
VGPLLRALATCHALSRLQDTPVGDPMDLKMVES 










TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLO
AMEEPPVPVSVLHRFPFSSALQRMSVWAWPGA 
TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 
YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 
EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRIRA

VMVTGDNLQTAVTVARGCGMVAPQEHLIIVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDOAAS 

YTVEPDPRSRHLALSGPTFGnVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASWSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDLQFLAIDLVrrTTVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLILAAAVSKGAPFR\RPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAG\SKKRFKQLERELAEQPWPPLPAGPLR 


3398 


A 


758 


1368 


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL

KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN 

RPGQGE^GLISPKPVTEVLPDVQGAPVPVPPLPT 

rPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 
SFL 


3399 


A 


906 


1091 


HHHHHHHHJHHHHHLVAFGKVQ*LQNSPSSSSSS 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400
3401


A


1838 


325 


PFLSVHRSPHUPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSS1PCSAHLTPSSLFPSSLESSSEQ 

KFYNFVILHARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAEDHSAFIILLLTvSN 

\FDCR\LSLHQVNQAMMSNLT\RQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

XTHT '1 TTS T*>T TT1 T /~\ A T"> -rr A ■» m m. tj- . i . L . _ 

M I FKPHRLQ ARXAMWRKEQDTRALREQS QHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQPVLIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 


3402


A


153


1389


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI











KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFIESIQPPSISAPA1ADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

A£KKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

WAGMGNSGITTELTLKYnTNP/TTLETGISSVNA 

GQDVNmTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS

CLSPCENhm^KXGVrmSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403 


A 


609 


2765 


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNMSDKHGFTILNSMHKYQPRFHIVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDFVSPSRG*RATPEAEEQRGSTAPRPATRAKISP

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

P\FNLNTMRPRLRYSPYSIPVPVPDGSSLLTTALPS 

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRSASP 


3404 


A 


1082 


1308 


LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIDCTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 
SHSSP 


3405 


A 


1553 


559 


PRPPTQRLSRFAPPCRTAEFPFRRRAWTRPAPPR 

ACTWGRSSPVTGLAVGAAVAMLTVAARSRPFA 

PVLSATSRGVAGALT\P*MQATVPATPEQPVLDL 

KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTDKVPDFSEYRRLEVLDSTBCSSRESSEARKGFS 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 

LALAKffilKLSDIPEGfemiAFKWRGKPLFVRHRT 

QKEIEQEAAVELSQLRDPQHDLDRVKKPEWVILI 

GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMVTVG 


3406 


A 


83 


2671 


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKKWKDQNIEYEYQNPRRNFRSVT 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKNITfflSSIQRHMWHSGDGPYK 

CKFCGKAFHWLSLYLIHERTHTGEKPYECKQCG 

KSFSYSATHRIHERTHIGEKPYECQECGKAFHSPR 










SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTfflRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTHIRfflSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEKPY 

ACKKCGKPFGSAQNLR1HERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

NVAKLSLLPVLFNIMKEFTLGRNPISVSNVRKPLF 

LPLLFNIMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 


3407 


A 


1426 


3 


PAAPSGASPGRVCGVETARPLGVQRRQSADEGP

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 

GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 

TILLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 

RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 

ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 

DRLLKVLF\LVVGYTVLAGMGLPQVVSGLAIVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGPPGTRLCPRSYTLSLRALLLFKILLSLKSLYOK 
KK 


3408 


A 



106 


4514 


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHLIELRREFKKJWPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQL1EAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LLELRRKYDEEAASKADEVGLIMTNLEKANQRA

EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF

KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHN1GQRVTGHYVLGLSQGSVSEILARPKP\ 

WKKLtiLr uKbPFIKMKQFLSDEQNVLALRTIQV 

RQRGSITPRIRTPETGSDDAJKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSnRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 










VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYVPRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL

WLNDPHNVEKLRDMKKLEBCKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQDCKPRWL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NWnSTWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE 

AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNLQRRHEKMANLNNIIYRLERAANREE 

ALEWEF 


3409 


A 


162 


1710 


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKI^FSNKSNLIKrERRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLIKHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRIHSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLEECHTVHKG 

KTLLICRDVQLI 


3410 


A 


167 


789 


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

DKLPDLMPPA\EPLGSALELRASLEEDVAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQEKEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

LGFWAICP 


3411 


A 


1040 


887 


ASLSKPAGISTMPWALILLFLLTHSAVSWQAGL
TQPPSVSKDLR\QTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 
ASLTIYGLQTEHEAD**CRPRRKLPKTARLFFFFL
IDNEEYLLRVY 


3412 


A 


164 


83 


RRGIPGSASLSLTMC\^SCFQSPRLQWVWRTAFL 

KHTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 










DPRIYKLCLEQLGLQPSESIFLDDLGTNLKEAARL 

GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPTYYIRLANRDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 

LCKJHSWLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 

PPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W*GGRSGRTSWRLLALGCHT 


3413 


A 


105 


1573 


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFIICLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAWL/RGSRLKVRWKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEVVIDRGPSSYLSNVDVYLDGffl,ITTVQGD/G* 

GPQHLSWGP*AFLGRE*RLRLSLSGVTVSTPTGST 

AYAAAAGASMIHPNVPAMITPICPHSLSFRPIVV 

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 


3414 


A 


20 


2602 



VIVNK^JVNWINYIYYNC^ 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLMKAPHAWTLMNTKGHHWLT 

NARLTKYQSLPCENPHIT1EVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAVVTLDAVTKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRJGSHNGPVFVADLDCVEMVHDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDOTWVLKASIIRQYYIARVEKD 

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV* AHP 

ESHRDWTAPTGLYWICGHRAYTKLP\ASSCVIGTI 

KPSFFLLSIKTGELLGFPWASR\K5IAIRN*NNDK 

WPPERnQYYGPAT*AQDGSWGYRIPIYMINRirRL 

QAVLKIITATGRALTILAQQETQMRNArYQNRLA 

LDYLLAAEGEVCRKFNLTNCCLHE)NQGQVVED 

fVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

3GFKTLIIRVnVIGTYLLLPRLLPVLL0MIKSFIAT 










LVYQNASAQVYYINHY 


3415 


A 


455 


108 


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPBFQGAMVRPKTGG/CGCEGGY*CQGEDS\P
KAEHFKMPEAGEGKSQV 


3416 


A 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417 


A 


243 


847 


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGfflSVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPR\KS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK


3418 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNh^VNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLmYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKXDMSPPFICEETDEQ 

KLQTLDIGSNLEICEKLENSRSLECRSDPESPIBCKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSBQVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSBCASGDENDNIEEDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTXKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYWGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3419 


A 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 










FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVKPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKJTNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SIO.LEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLICAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPUCKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFW 

GGGDELTNLENDLDTPEQNSKLVDLKLKXLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDSVSQYWGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 


3420 


A 


612 


1058 


ENLGPNYSHRLLHHPTFYKJO^ 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALnTTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 


3421 


A 


23 


2005 


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEATLAKKRKVLEFERVYLDNLPSASMYER5 

YMHRDVITHVVCTKTDFIITASHDGHYKFWKKIE 

EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVNFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMIEYWTGPPHE 

YKFPK>TVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRJFRFVTGKLMRVFDESLS 

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFV^YGTMLGDCVINV^ 

RILGKQENIRVMQLALFQGLAKKHRAATTIEMKA 

SENPVLQNIQADPTIVCTSrTCKNRFYMFTKREPE 

DTXSADSDRDVrTsIEKPSKEEVMAATQAEGPKRV 

SDSAimTSMGDMTKLFPVECPKTVENFCVHSRN 

u x I iNun l r JrixuxivOr Mly 1 UDr 1 GTGMGGESIWG 

GEFEDEFHSTLRHDRPYTLSMANAGSNTOGSQFF 

[TVVPTPWLDNKHTWGRVTKGMEVVQRIShAVK 

VNPKTDKPYEDVSIINITVK


3422 


A


2486


433


FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ 
DFQLRl^RIffiPNEVTHSGDTGVETDGRMPPKVT 










SELLRQLRQAMRNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAUTEEHAAMWTO 

GRYFLQAAKQMDShTWTLMKMGLKDTPTQEDW 

LVSVU>EGSRVG\nDPLIIPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFVVTALDEI 

AWLFNLRGSDVEHNPVFFSYAIIGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETPKDHRC 

CMPYTPICIAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRJENVVLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423 


A 


5515 


934 


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQELTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDEERPSTGGLGFSWALRSQNLGKVDEFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGnUVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDVVEVL 

RNAGQWHLTLVRRKTSSSTSPLEPPSDRGTVVE 

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMIPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISrV 

GGQTVIKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

VVFIVQSLSSTPRVIPNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 










DAFTDQKJRQRYADLPGELH11ELEKDKNGLGLS 

LAGNKDRSRMSIFVVGINPEGPAAADGRMHIGD 

ELXEINNQILYGRSHQNUSAIIKTAPSKVKLVFIR 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

\DGSLEWGKQLPESESFKLAVSQMKQQKYPTKV 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIVPGQEMIIEISKRRSGLGLSIVGGKDTPLV 

NGVDLRNSSHEEAITALRQTPQKVRLVVYRDEA

HYRDEENLEIFPVDLQBCKAGRGLGLSIVGKR 


3424 


A 


2223 


1162 


HASERVVQLPDFVWDQYTHSLGRVEREFKI^RKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW^[CR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR


3425


A 


2223 


1162 


HASERVVQLPDFVWDQY1HSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYrTKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRWVHCR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3426 


A 


2 


1553 


LFVWHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFKALI 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQVVQWVSFADSDIVPPASTWVFPTLGIM 

ffflNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRAWFGEVKLCEKMAQRDAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVTLDEFKRKYSNEDTLSVALPYFWEHFDKDGW 

SLWSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGKAFNQGKIFK 


3427


A 


755 


52 



TAARRRQKGTAARRRQKGTAARRRQKGTAARR

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

hjusjsjsa^jlu i AAKRaQKGTAARRRQKGTAARRR 

QKGLSNLDAAEWLPPKKG\GEKKKGPFLAINEV 

VT^YPINILKJaHGVGFKKRAPRALKEIRKF^ 

KEMGTPDVRIDTRLNXAVWAKGIRNVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYVPVTTFKlsILOTV 
WDEN 


3428 


A 


4 


1939 


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TNVWVALYKNNVPATYTYDEYKKGYLDQASG 

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 


3429 


A 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQ\AQLGQLS 

YLAPGEDGHWVPIPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430 


A 


799 


1989 


INKYINIRXKIKLLSPLPPLWSHLALLQASATKWV 

LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 

TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 

KRGLLLLK\EAG\LEK1NFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 

YLDILAISCDSFDEEVNCP\IGRGN\GBCKNHVENL 

QKL\RRWCRDYRVPFKINSVINPF\NVEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRGGKYIWSKADLKLDW 


3431 


A 


5468 


2146 


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNYWRVSHINSNYKLCPSYP 

QKLLWVWITDKELENVASFRSWKRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKIJJ1JDARSYTAAVANRAK 

GGGCECEEYYPNCEVVFMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQHLSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPCNEREK 

R2vITYmGTCSWALLRAGNKNFHNFLYTPSSD 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
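
The one-letter code and the three marker symbols in the column heading above can be read mechanically. The following is a minimal sketch only, not part of the Sequence Listing: the names AMINO_ACIDS, MARKERS and summarize are hypothetical and introduced purely for illustration. It separates the residues of a listed sequence from the stop, deletion and insertion markers, and reports any character that falls outside the legend (for example, a character introduced by a transcription error).

```python
# One-letter amino acid codes, exactly as given in the column heading.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine", "L": "Leucine",
    "M": "Methionine", "N": "Asparagine", "P": "Proline",
    "Q": "Glutamine", "R": "Arginine", "S": "Serine", "T": "Threonine",
    "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue symbols defined in the same heading.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(sequence):
    """Hypothetical helper: split a listed sequence into its parts.

    Returns (residues, markers, unrecognized), where residues is the
    string of amino acid letters, markers is a list of
    (position, meaning) pairs, and unrecognized is a list of
    (position, character) pairs not covered by the legend.
    Positions are 0-based offsets into the input string.
    """
    residues, markers, unrecognized = [], [], []
    for pos, ch in enumerate(sequence):
        if ch.isspace():
            continue  # line breaks and spacing carry no meaning
        if ch in AMINO_ACIDS:
            residues.append(ch)
        elif ch in MARKERS:
            markers.append((pos, MARKERS[ch]))
        else:
            unrecognized.append((pos, ch))
    return "".join(residues), markers, unrecognized


if __name__ == "__main__":
    # A line taken from the table, containing two insertion markers.
    line = r"EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP"
    residues, markers, unrecognized = summarize(line)
    print(len(residues), markers, unrecognized)
```

Characters reported as unrecognized are left for manual review rather than corrected, since the intended residue cannot be inferred from the legend alone.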



MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSG VGGPQQTVGE VGLPPPLPS SO 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEEIXGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDFJLNQDPSGSVASISHQEQLSSVP 

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RLRQIEAGYKQEVEQLRRQVRELQMRLDIRHCC 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

DCLSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 
PIATA.SS 



36 



1873 



3433 



1481 



476 



Mlfi-SijVADFIGLDPRIAAWLIDPSDATPSFEDLV" 
EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 
LYRLTMDLCSKLKDYGLWQLFRTLELPLIPILAV 
MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 
VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 
TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH 
KIKSTFVDGLLACMKKGSJSSTWNQTGTVTGRLS 
AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 
MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 
FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 
KKWYAWYGAGKERLAACLGVPIQEAAQFLES 
FLQKYKKKDFARAAIAQCHQTGCVVSIMGRRR 
PLPRIHAHDQQLRAQAERQAVNFVVQGSAADLC 
KLAMIHVFTAVAASHTLTARLVAQIHDELLFEVE 
DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 
HLVPLQEAW\ALRQAHVALSLPATAWLPLGPLP 
APSPHPCIFRLHFVCSPRQQWEERTGFQQSIVWPS 
PRSP ALYAPGPJNPLGLGWPAIPWSKCLCKALKK 
K 



IPPKbKAPUlRASCLAITAGARPTSYGRVGCEGDV 
RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 
ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 
KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 
LSTLQVCNWFINARRRLLPDMLRKDGKDPNQFTI 
SRRGAKISETSSVESVMGIKNFMPALEETPFHSFT\ 
AGPNPTLG\RPLSAKP/SQSPGSVLARPSVICHTTV 
TAIERLSLSLSCQSVGCGQNT\DIQQIAT\RNLRDS 
SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 
DFSGFQLLVDVALKRAAEMELQAKLTA 



1720 



1243 



NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ" 
VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 
PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 
IGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 
LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 
RTVQLNVCSSEEVEK V/V GDCPLEPEGP\EKGMW 










GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 
VRMTKSFLPLIRRAKGRVVNISSMLGRMANPAR 
SPYCITKFGVEAFSDCLRYEMYPLGVKVSVVEPG 
NFIAATSLYSPESIQAIAKKMWEELPEWRKDYG 
KKYFDEKIAKMETYCSSGSTDTSPVTOAVTHALT 
ATTPYTRYHPMDYYWWLRMQMTHLPGAISDM 
IYIR 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEBERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGBCKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPN1QKLLYQRFNTLAGGMEGTPFYQPSPSQ 

DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 

FNPLALLLDASLEGEFDLVQRQYEVEDPSKPNDE 

GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD 

GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 

ffiTAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 

KGVAYALWDYEAQNSDELSFHEGDALTILRRKD 

E 


3436 


A 


3 


2604 


GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMrTVrDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKE.KGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKJMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKOGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDVFQVQKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLSIEVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 
PCT/US01/04098 






PKM PLLHLKAA VKEKKR>JKKKK 11GSPKRIQSPL 
NNKLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 
EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAVEFNDVKTLLREWITTISDPMEEDILQVVKY 
CTDLEEEKDLEKLDLVIKYMKRLMQQSVESVWN 
MAFDFILDNVQWLQQTYGSTLKVT 



4038 



3438 



469 



2602 



SLLRLLKAQ WGSSGAASEPWLGEEGCGFPSTNE 
YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 
DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 
ELAASENTOSPSPRPLRPGVTLPPGALTMNTKDT 
TEVAENSHHLKIFLPKKLLECLPRCPLLPPERLRW 
NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 
YNRKKVKYRKDGYLWKKRKDGKTTREDHMKL 
KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 
IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 
WSREELLGQLKPMFHGIKWSCGNGTEEFSVEHL 
VQQmDTHPTKPAPRTHACLCSGGLGSGSLTHKC 
SSTKHRHSPKVEPRALTLTSIPHPHPPEPPPLIAPLP 
PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 
SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 
PSMSLAWVGTEPSAPPAPPSPAFDPDRFLNSPOR 
GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 
EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 
KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 
GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 
PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 
AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 
LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 
DWLSLDDNQFRMSILERLEQMEKRMAEIAAAGO 
VPCQGPDAPPVQDEGQGPGFEARVWLVESMIP 
RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 
IETLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 
MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 
LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 
PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 
PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 
DYEATNSKGPLSSLPALPPASDDGAAPEDADSPO 
AVDVIPVDMrSLAKQIIEATPERIKREDFVGLPEA 
GASMRERTGAVGLSETMSWLASYL\ENVDHFPS 
STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 
IGKLDFALLTL\SD\QEQRELYEAARVIQTAFRKYK 
GRRLKEQQEVAAAVIQRCYRKYKQLTWIALKFA 
LYKKMTQAAJLIQSKFRSYYEQKRFQQSRRAAV 
LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 
KQDQAARK1MRFLRRCRHRMRELKQNQELEGLP 



FGRLLWGTAFXSWKMKAP1PHL1LLYATFTQSLK 

VVTKRGSADGCTDWSIDIKKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDIEDFLLPTREPEILWYKECRTKT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKnRTCFTP 










DLDENRVWESDIVKILKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRIUiASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKCYKffilMLFYRNHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGTYI 

EDVARCVDQSKRLIIVMTPNYVVRRGWSIFELET 

RLRNMLVTGEIKVILIECSELRG1MNYQEVEALK 

HTIKLLTVIKWHGPKCNKLNSKFWKRLQYEMPF 

KIUEPITHEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLINGQRPQT 

KSSREQNPDEAHTNSA1LPLLPRETSISSVIW 


3439 


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDVVTETEGKQ 

NRAWDAVMVCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 

GLSITKGKKAKFIVNDELPNCILCGAITMKTSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

LKSLCTKKIFLYKQVFPLNLERATLAIIGLIGLKGS 

ILSGTELQARWVTRWKGLCKRPASQKLMMEAT 

EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT 

KPSIPLLFLKDPRLAWEVFFGPCTPYQYR\LMGPG 

KWDGARNAILTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLICK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440 


A 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCffiSVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKRHFVQSAKE 

VANSTANLVKTIKALDGDFSEDNRNKCRIATAPL 

IEAVENLTAFASNPEFVSEPAQISSEGSQAQEPELV 

SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 

HSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAKEGGGNPKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKA1AVTAQEM 

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGVAL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGIIADLDTTMFATAGTLN 

AENSETFADHRENELKTAKALVEDTKLLVSGAAS 

TPDKLAQAAQSSAATITQLAEWKLGAASLGSD 

DPETQVVLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQLKGAAKVMVTNVTSLLKTVKAVEDE 

ATRGTRALEATIECIKQELTVFQSKDVPEKTSSPE 

ESIRMTKCITMATAKAVAAGNSCRQEDVIATAN 

3441 





1584 



3442 



3443 



160 



822 



3444 



566 



1718 






LS^VSDlVlLlACKQASFHPDVSDEVRimRF 
GTECTLGYLDLLEHVLVILQKPTPELKOOL^S 

TCLLG^SIEAAAKiaEQLKPRAKPKQAD™ 

GKVGSIPANAADDGQWSQGLISAAR1^AAATS9 
LCEAANASVQGHASEEKUSSAJ^^^x^LL 

X^ Q 5^ G ^ DDDW ^^VGGIAQIIM 



NSARGGVGVRGARAMA1 V^hKAAALNLSALHS 

p^gfsvaqkpfgat™ 

GDVDIPRAKVVRVCQALMDYKVFEiWPTXVTO 

™ I? NKNLSKGK7I)LLVLF L\MDHQKDVFKI 
PGTL\HKIVS\VK\LMAIQNGRDPNRDAGWCORI 

KKK\LLGQFYKCHPDIFIEHFflD 



gsvyfsypdsngmpvwqllgfvtogiSaifkk 

GLKSGEQSQHPFGAMN™ S VAQIGisv1S»S 
svmVKKRWLE AlMAGGMKVAVSPAVGPGPWG ~ 



SGVGGGGTVRLLLILSGCLVYGTAETDVNwS 
QESQVCEKRASQQFCYTWLIPQWIM^o^ 
VNSSRLVRVTQVEl^EKLKELEQFSI^WFsSr 

^^PKLI^VFLLGlJ^FFCGDLLSRSQn^S 
JS^ GIVASL ^^ S ™KKSPIYVELVGGW 

MSFAVCYKYGPLENERSINLLTWTLOLMGLCFM 

rcpaTtqSlt GLGS DEIYEE 8 



^^o JCC ^ fcSiJS£KTTEK EN LGPRMDPPLG ~ 

MRSIIFANYIARDTRRLGATILDRIHSLQINSa.Sr 

GGQDTFMENWTSQRDNIFRNVEVLIYVFDVESR 
ELEKDMHYYQSCLEAILQNSPDAKicLraS 










DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDEFTSN 

TYVMWMSDPSIPSAATLrNIRNARKHFEKLERV 

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLYHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFEDIFTSN 

TYVMVVMSDPSIPSAATLINnWARXOT 

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSIIFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHK14D 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMWMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADHPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVWEQSIPESTSTPDLAGEPSAEGADGQAEIHT 

KTPIMKIMKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKFPYPTKAELCYLTVyTKYPEEQLKIW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHWGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREEDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEVVRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 


3448 






MLYEEDLQJNLUDKTQMSSQQVKQWFAEKMGEE" 
TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 
- VSENSESWEPRVPEASSEPEDMSSPOAGROT FTn 




A 


2 1324 " 


b VAi^K^KTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNKIDPHAPNEMLYGRIGYIY 

ALLFVNKNFGVEKIPQSHIQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPCIGDNRDLLVHWCHGAPGVIYMLIOAYKVF 

R/EREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAFL1LYNLTQDMKYLYRACKFAEWC 

iJ^^^CRTPDTPFSLFEGMAGTIYFL\ADLLFP 


3449 




1 




A 


3 2389 


bKHVTGAARsfiiKAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVORAK 

TQEHTRTCRSVLELVRQAPFLVTLHYAFQTDAKL 

^ILDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

^^o^i5 LGIIYRDLKLENVLLDSE GHIVLTD 

FGLSKEFLTEEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGILLFELLTGASPFTLEGERNTOAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGO\PPPG 

DPRIFQGYSFV APSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGWHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVILY\MMLSGQAPFQGASGQGGOS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 


3450 








A 


201 1705 



Mj I EMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

VP^^DPEKKRYFRLLPGHIWCNPLTKESIR 

^AffiSKRLRLLQEEDRRKKIARMGFNASSMLR 

CSQLGFLNVTNYCHLAHELRLSCMERKKVQIRS 

VTDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

CYGnNLQSLKTPTLKVFMHENLYFTNRKV\NSV 

:WASLNHLDSHILLCLMGLAETPGCATLLPASLF 

/NSHPAGIDRPGVMT P^FR TORAWCP a iuct xtt/~. a 

WCFSTGLSRRVLLTOWTGHRQSFGTNSDVLA 
JQFALMAPLLFNGCRSGEBFAIDLRCGNQGKGW 
^™^SAVTSVRILQDEQYLMASDMAGKIK 
WDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 
VAVGQDCYTRIWSLHDARLLRTEPSPYPASKAD 










IPSVAFS SRLGGSRG APGLLMA VGQDLYCYS YS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCPTFRIDNTTYGCNLQDLQAGTIYNFKIISLDE 

ERTVVLQTDPLPPARFGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSP\0O)IGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGVVDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQ>T^WKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPGVmHDSYMTLSHKGTIKESR 

VLAPWITVETHFKELVPGRLY\QVTCSAVSLGELS 

AQKIVIVAVGRTFPDKVANLEANNNGRMRSLVVS 

WSPPAGDWEQYRILLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGISSR 

QVWEGRTWSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKVVQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 

NKHOTIQTKSIPKSENECVFVQLVPGRLYSVTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIMSPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS 

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYTsT 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YNNRKSEGRIVYGLRPGRSYQFNVKTVSGDSWK 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 

NIMMLVPHKRYLVSIKVQSAGMTSEVVEDSTIT 

MIDRPPPPPPHlIlVNEKD\a.ISK5SINFTWCSWFS 

DTNGAVKYFTVWREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SIRAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 

LFGAIEGVSAGLFLIGMLVAWALLICRQKVSHG 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPIK 

INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 

SCDIALLPENRGKNRYNNILPYDATRVKLSNVDD 

DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 

WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 



3452 



3453 






63 



1073 



A I 2674 



514 



A 1844 



244 






PADQDSL YYciULlLQMLSES V LPE WTIREFKICGE 

EQLDAHRLIRHFHYTVWPDHGVPETTOSLIOFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIM.DR 

ILQQLDSKDSVDIYGAVVHDLRLHRVHMVOTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 
NPEYHRDPVYSRH 



FFRSSSDNGSPIRQYE/HS'rPAHQGPVMGLEGKSr 
ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 
TAAKSITKKCEKRSSSWKETELVWDTPGIFDTE 
VPNAETSKEIIRCILLTSPGPHALLLVVPLGRYTEE 
EHKATEKILKMFGERARSFMILIFTRKDDLGDTN 
LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 
EQEAQRAQLLGLIQRVVRENKEGCYTNRMYOR 
AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 
IRKLEDKVEQEKRKKQMEKKLAEQEAHYAVRO 
QRARTE VESKDG ILELIMTALQIASFTLLR I .F A ED 



3330 



GP1 1 i^LKKKAKMKDMPLRIH VLLGLAITILVOAV 
DKKVDCPRLCTCEIRPWFTTRSIYMEASTVDCND 
LGLLTFP ARLPANTQILLLQTNNIAKIEYSTDFPV 
NLTGLDLSQNNLSSVTNINGKKMPQLLSVYLEEN 
KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 
LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 
ENPIIRIKDMNFKPLINLRSLVIAGINLTEIPDNAL 
VGLENLESISFYDNRLIKVPHVALQKWNLKFLD 
LNKNPINRIRRGDFSNMLHLKELGINNMPELISID 
SLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKL 
ESLMLNSNALSALYHGTIESLPNLKEISIHSNPIRC 
DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFOGO 
NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY 
VSFHCRATA\EPQPEri r WITPSGQKLLPNT\LTDKF 
YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 
SVMIKVDGSFPQDNNGSLNIKIRDIQANSVLVSW 
KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 
KVYNLTHLNPSTEYKICIDIFnYQKNRKKCVNVT 
TKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVIC 
LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 
PLINLWEAGKEKSTSLKVKATVIGLPTTJMR 



ERYLFATYVAi'SATLDIGLQQEKiCKEIYMKIQPP 
FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYKKV 
ELVEETRQLDSTYFRKLQALHKETFSKKAEDTTC 
EIGTGILSLSNVSKRTEYWDNVPAEYKHFKFSDL 
LNNKLEFEHFRQFLETHSSSMDLMCWTDffiQFRR 
ITYRDRNQRKAKSIYKNKYLNKKYFFGPNSPAS 
LYQQNQVMHLSGGWGKILHEQLDAPVLVEIOK 
HVQNPXENVWLPLFLASEQFAARQKKVQMKDI 
AEELLLQKAEKKIGVWKPVESKWISSSCKEAFRK 
ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVO 
KYKDLCHSHCDESVIQKKITTIINCFINSSIPPALOI 
DIPVEQAQKIffiHRKELGPYVFREAQMTFLGVMF 
KFWPQFCEFRKNLTDENIMSVLERRQEYNKOKK 
KLA VL/QNDEKS GKDGKQ YANTS VPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIOE 
ELEK\SCLQACNLSQILRLALQLCL ' 



API AVAMMSFGGADALLGAPFAPLHGGGSLHY " 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVR 










ASPSRPRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECR1GFGPIPFSLP 

EGLPKIPSVSTHIKVKSEEKIKVVEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PE\AKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEVAKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPABCEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 

E AKKEE AEDKKK VPTPEKE A PA K VE VKED A KPK 

EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVP1PPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCDLPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEBCPDS 

KTKMDLGGHVKRKHLSYVGTP 


3457 


A 


2 


4869 


FBLSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MWIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQS WSKQATSALQQEETSEKKS 

RKVVIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIIKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ ! 


3458 








W YRNG VP LSPSKWVQTL W SGERATLTFSHLNKE 

DEGLYT1RVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYinSWKQPAVDGGSPIL 

GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

ARLKS/PPLSTLDW1AVIVTEEEPSEGIVPGPPTDLS 

VTEATRSYVVLSWKPPGQRGHEGIMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTWGDKLDIPKAPGKI 

IPSRNTDTSVWSWEESKDAKELVGYYIEANVA 

GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC 

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEVWNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCD VTDTDGIASS YLIDEEELKRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRF\WMQAEKLS 

GNAKVNYIFNEKGIFEGPKYKMHIDRKrGIJCEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTVVLVGD 

VFKKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT 

GECWLLKCKVANIKKETHIVWYKDEREISVDE 

KHDFKDGICTLLITEFSKKDAGIYEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKKIALSATDLKIO 

STAEGIQLYSFVTYYVEDLKVNWSHNGSA re Y^n 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

KTGHQKTVDLSGQAYDEAYAEFQRLKQAAIAEK 

NRARVLGGLPDWHQEGKALNLTCNVWGDPPP 

EVSWLKNEKALASDDHCNLKFEAGRTAYFTING 

VSTADSGKYGLWKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 




A 


3963 


827 


L6KSSSDNNTN'l-LGRNVMSlA'rePLMGAOSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGOS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDOEY 

EEVMILRRPSLQRRAGSRSDVTHHAVTSQLpbvP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALVPAFDPRPGRTMVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTIFYYVQKLLQLSCNGNVKSDKLRRIWEPTY 

riMYREMKDSDKEKENGKMGCWSEHVEOYLG 

rDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

RKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

.QSSDDLNLTKEQPQAKAGNGQNSCGVEDVLOL 

.RILYIVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

visji i WibJiPLALASGALPD WCEQLTSKCPF 

.IPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

LTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

dEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

EFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

'LGGGLKPPGYYVQRSCGLPTAPFPQDSDELERJ 










TKLFHFLG1FLAKCIQDNRLVDLPISKPFFKLMCM 

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 

WEDFELVNPHRARFLKEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 

MFDFCMHTGIQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMCLCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRFVRVLCGMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTWRKVDATDASYPSVNTCVHY 

LKLPEYSSEEMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQIRRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

PCETQTPVMAQPKEDEEEDDDWAPKPPffiPEEEK 

TLKKDEENXDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 


3461 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKXETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TLKKDEENNDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSrT)WTVKLWTTBCNNK^ 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNlsnDTEVPTASISVEGNPALmWWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 

3462 


Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
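
The one-letter codes and the three annotation symbols in the column heading above can be checked mechanically. The following is a minimal sketch only, not part of the Sequence Listing: the helper name `summarize_peptide` and the `AMINO_ACIDS` and `MARKERS` tables are hypothetical and introduced purely for illustration. It counts the residues, stop codons, possible deletions (`/`) and possible insertions (`\`) in one table entry, and collects any character that falls outside the legend, which in a scanned copy usually indicates a recognition error.

```python
# One-letter codes as given in the column heading of the table.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Annotation symbols from the same heading.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_peptide(seq):
    """Hypothetical helper: tally residues and legend markers in one entry."""
    summary = {"residues": 0, "stop": 0, "deletion": 0, "insertion": 0,
               "unrecognized": []}
    for ch in seq:
        if ch.isspace():
            continue  # line breaks and spacing carry no sequence information
        if ch in AMINO_ACIDS:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary[MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)  # outside the legend
    return summary


if __name__ == "__main__":
    # Short fragments of the kind that appear in the table.
    print(summarize_peptide("V\\EKGKCTEL"))
    print(summarize_peptide("DF/ENDK"))
```

Whitespace is skipped because the table wraps each sequence across several lines; lowercase letters and punctuation other than `*`, `/` and `\` are reported rather than silently dropped, so that doubtful characters can be checked against the original filing.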


"3463 


A 


2 


2643 


iAPEFSRSTHASAHASVARVLJKMREUQLKKEOR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 

AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 

AESGARSVSSrVRQWNRKINHFLGDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRIIDI 

VMQRMTIVNLEADMERLIKKREELFLLQEALRR 

KRERLQAESPEEEKGLQELAEEIEVLAAMDYIND 

GITDCQATrVQLEETKEELDSTDTSVVISSCSLAE 

ARLLLDNFLKASIDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 

PTRGSTFPRQSRATETSPLTRRKSYDRGQPJRSTD 

VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSL\SEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

LVTGQEIAALKGHPNNWSKYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGKLTGHIGPVMCLTVTQTASOHDL 

WTGSKDHYVKMFELGECVTGTIGPTHNFEPPH 

YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIO 

QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV 

WNVDNFTPIGEIKGHDSPINAICTNAKHIFTASSG 

CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 




A 


198 


3146 

] 
Z 
I 
^ 
I 
C 
V 


SOEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 

GVYRAESIHTGLEVAIKMtDKKAMYKAGMVOR 

VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

MCHNGEMNRYLKNRVKPFSEKEARHFMHQIITG 

MLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGL 

ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 

SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV 

LADYEMPTFLSIEAKDLIHQLLRRNPADRLSLSSV 

LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGSLFDKRRLLIGQPLPl^lTVFPKNK 

SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIOD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 

PNRDFQGHPDLQKDTSKNA WTDTKVKKNSD A S 

DNAHSVKQQNTMKYMTALHSKPEIIQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 

LKPIRQKTKKAWSILDSEEVCVELVKEYASQEY 

VKEVLQISSDGNTnTY^PNGG\RGFPLA\DRPPSP 

r\DNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 

"CSPKITYFTRYAKCILMENSPGADFEVWFYDGV 

"CIHKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIK 
^YMDHANEGHRJCI J AT.F9TT<;PBPT?i( r TX> c a dctott 

GRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 

/MHSAASPTQAPILNPSMVTNEGLGLTTTASGTD 

SSNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTS 

3AVWVQFNDGSQLWQAGVSSISYTSPNGQ\TTR 

yGENEKLPDYIKQKLQCLSSILLMFSNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKKNPW 
V\EKGKCTEL 


3465 


A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGWRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGVVDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTE)SAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPL VMS ANLKAAEEEL VFQKRQLLRVWG SQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPLWTPKMIAALDYDPGDGQMGGQGKGRL 

ALRAGDVVMVY\GPMDDQGFYYGELGGHRG\L 



PCT/US01/04098 




3466 



1 



VPAMLKIKMSSQGH 

1111 ~ I MSKPPDLLLRLLRGAPRQJKVCI'LFIlGFKFTFFVSr 



3468 



2175 



MIYWHWGEPKEKGQLYNLPAEEPCPTLTPPTPP 
SHGPTPGNTPFLETSDRTNPNFLFMCSVESAARTH 
PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 
QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 
PVLSDASRIALMWKFGGIYLDTDFIVLKNLRNLT 
NVLGTQSRYVLNGAFLAFERRHEFMALCMRDFV 
DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 
ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 
RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 
AR YCPTTHE/DHENV L VKGPA GHLPN7 J .T .MfiHW 



147 



3209 



MAKVILKQSKQCKNLLTCK V AQ VCPVCGCLHC 
YFWWLSGLESRRPSSPLIDIKPDSFGVLSAKKEPIO 
PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 
TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA 
RDLPPPISHDGSRQDMAHSNPYVKICLLPDOKNS 
KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 
TWDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 
WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 
LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 
NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 
LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLP\SL 
QRGEGEAMLSVALTLFSRSPLEQNIIQPLVLSLLHL 
CGSWNMPPGNSQPRGDFLYHSICTWVQDNYAO 
PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 
WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 
CRVFRRQFGMDYVDILQIHRWDYNTPIEETLEAL 
NDWKAGKARYIGASSMHASQFAQALELOKOH 
GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 
AVBPWSPLARGRLTRPWGETTARLVSDEVGKNL 
YKESDENDAQIAERLTGVSEELGATRAQVALAW 
LLSKPGIAAPIIGTSREEQLDELLNAVDITLKPEQI 



ALPLPLPTLYI'UMSRRKQRKPQQLISDCEGPSASE 
NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 
STDPPVMVnGGQENPNNSSASSEPRPEGHNNPQ 
VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 
FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 
APPPPPPPPPPPGVGSGHLNTPLILEELRVLQQRQI 
HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 
GTGTASSTKPLLPLFSP1KPVQTSKTLASSSSSSSS 
SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK 
PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 
LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 
LEKPGGRHKCRFCAKVFGSDSALQ1HLRSHTGER 
PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVO 
MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 
ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 
ATAPGLPAFNKFV^MKAVEPKNKADENTPPGSE 
GSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 
FKSTGSFPLPLCARALGVASPSETSKLQQLVEBQD 
RQGAVAVTSAASGAPTTSAPAPSSSASSGPNOCV 
ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 
STRGNLRAHFV GHKASPAARAQNSCPICQKKFT 










NAVTLQQHVRMHLGGQBPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERS S SPA S ALTPEGEA TS VTL VEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLRTCWCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664 


NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCWSSL 

GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKMCGAKLKVGQYLNCIVEKVKGNGGVVSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLWKAQVQ 

KVTPFGLTLNFLTFFTGWDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARLSHLSDSKNVFNPEAFKPGNTHKCRIIDYS 

QMDELALLSLRTSIBEAQYLRYHDIEPGAVVKGT 

VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 

NPEKKYHIGDEVKCRVLLCDPEAKKLMMTLKKT 

LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV 

KFYNNVQGLVPKHELSTEYIPDPERVFYTGQVV 

KVWLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGQLVDVKVLEKTKDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 

PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLA1TSLLLLNQCLEELQGVRSLM 

SNRDSVLIQTLAEMTPGMFLDLWQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKW 

ILNVDLLKLEVHVSLHQ\DLV\NRKARKLRKGSE 

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASHILDDVPEGTSP1TKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRT1PELSVRPSELEDGHTAL 

NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 

WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTPNEGLTVSFPFGKIGTVSEFHMSDSY 

SETPLEDFVPQKVVRCYILSTADNVLTLSLRSSRT 

NPETKSKVEDPEINSIQDIKEGQLLRGYVGSIQPH 

GWFRLGPSVVGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 



2334 



537 



1226 






Y^GkbtAhb 1 N VLPKEKQTKPAEAPRLSlSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHOATI 

^^ MLEKQ ^^ LSRre EALMDPGRQPE 

^^^ KTO ^QE^NVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

ECTQEAGELYNRMLKRFRQEKAVWKYGAFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAOL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVYff) 

MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR 

VLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 
ED 



148 



TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSY 

AKVKSAYSERLKFNVAVKIIARKKTPTDFVERFL 

f^^ ILATVNHGSIIKTYEIFETSD GRIYIIMELG 

VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CmXLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLOSIPYOPK 

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIOK 

E^VDFPRSK]^TCECKDLrYRMLQ\PDVS\KRLH 

roEILSHSWLQPPKPKUTSSASFKREGEGKYRAE 

^?Z KTGLRPDHRPDHKLG AKTQHRLLVVPEN 
ENRMEDRLAETSRAKDHHTSG A Fvrn< r act 



2272 



IKKUAPQHPTLPLPSLTPSSVHI'GUPKITPSVILFL 

PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD 

GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEOW 
RSRRS YSCQVMQEGSTVEKS V A P a vr g ^ 



DJU'TKHKTYLSSSWAKMAAAEUPVGDGELWOT 
W^PNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 
PPPQLLTRNWFGLGGELFLWDGEDSSFLWRLR 
GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 
QHHVALIGKGLMVLELPKRWGKNSEFEGGKST 
3^ VAER ^^LTLKHAjVWYPSEILbPH 
VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

©LIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSYGLTWIHKL 

HKFLG SDEEDKDSLQELSTEQKCF VEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLOLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDOKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 



DKPTRHKTYl,!StiS WAKMAAAEUP VGDGELWOf 

mPNHVVFLRLREGLKNQSPTEAEICPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 
QHHVALIGKGLMVLELPKRWGKNSF.FPr,r^Q T 










VNCSTTPVAERFFTSSTSLTLKPIAAWYPSEILDPH 

VVLLTSDNVIIUYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFW1VPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMhniMKKLLHSFHSELPVLSDSERDMKKEL 

QLD>DQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDE1PLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVKNLSALSDWYSVYTSAIAFTVYMNAVWH 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGIKFFL3DFIFKRCPRLR 

AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS 

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVT\ENYLCFESSKSGSSKRNKVIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETOGQAKAKSGWLKPYYF 

IELMESRKDITNQEELWKMKPRRNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPIKIAAIIASLTFLYTLLREVIHPLA 

TSHQQYFYKIPBLVINKVLPMVSITLLALVYLPGV 

IAAIVQLrlNGTKYKKFPHWLDKWMLTRKQFGL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALVffiHDVWRMEIYVSLGIVGLAJLAL 

LAVTSIPSVSDSLTWREFHYIQSKLGIVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPIVVLI 

FKSILFLPCLRKKILKIRHGWEDVTKINKTEICSQL 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAWEAVHRL 

DLILCNKTAYQEVFKPEMSLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 



3477 






3902 






KKCtiiKKLS>i''iKKRCKDKKLLVNFMYLOSLLO 
PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 
LASEDEEEYESGYAFLPDLLFQMVIICLMCVHSL 
ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 
EEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 
PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 
DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 
AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 
PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTOM 
FQTXRCFRLAPTFSNLLLQPTTNPHTSASHRPCV 
NGDVDKPSEP ASEEGSESEGSESSGRSCRNERSIO 
EKLQVLMAEGLLPAVKVFLDWLRTNPDLnVCA 
QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 
O^LLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 
RRFNFDTDRPLLSTLEESVVRICCIRSFGHF1ARLO 
S^ I ^ Qn ^Q s EQESLLQQAQAQFRMA 
QEEARRNRLMRDMAQLRLQLEVSQLEGSLQOPK 
AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 
IPRTVroGLDLLKKEHPGARDGIRYLEAEFKKGN 
RYERCQKEVGKSFERHKLKRQDADAWTLYKILD 
SCKQLT\LAQGAGEEDPSGMVTIITGLPLDNPSVL 
SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 



MTEPRERRCj V S VPPRP£ VGTQA TH WRVEESNFN 
KIFLKKDAELGRSNHLPTWDKPEDASWLPOSCL 
GGDAVATTGEIHEEKAWKTRALEVGQPAORDIR 
RGELWGKEHGADQAIQETLEDLSSLERTLWSES 
SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSO 
ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 
LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 
HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 
LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 
ETKHAELIVVKELDREIHSFFDLVLTAYDNGNPP 
KSGTSLVKVNVLDSNDNSPAFAESSLALEIOEDA 
APGTLLKLTATDPDQGPNGEVEFFLSKHMPPEW 
LDTFSIDAKTGQVILRRPLDYEKNPAYEVDVOAR 
DLGPNPIPAHCKVLKVLDVNDNIPSIHVTWASOP 
SLVSEALPKDSFIALVMADDLDSGNNGLVHCm 
SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 
YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 
EKSRYEVSTREhM.PSLHLmKAHDADLGn^K 
y^Q DSp VAHLVAIDSNTGEVTAQRSLNYEEM 
AGFEFQV1AEDSGQPMLASSVSVWVSLLDANDN 
APE WQP VLSDGKASLS VL VNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 
ANGEPLYSIRSGNEAHLFILNPHTGOLFVNVTNA 
SS^SEWELEIVVEDQGSPPLQTRALLRVMFV^ 
VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 
LALFMSICRTEKKDNRAYNCREAESTYROOPKR 
PQKHIQKADIHLVPVLRGQAGEPCEVGQSHKDV 
DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNOG 
NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 
NLPEPQPATGQPRSRPLKVAGSPTGRLAGDOGSE 
EAPQRPPASSATLRRQRHLNGKVSPEKESGPROI 
LRSLVRLSVAAFAERNPVEELTVDSPPVOOISOLL 
SLLHQGQFQPKPNHRGNKYLAXPGGSRSATPrvm 










GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGS SSS SRCL | 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSELPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDrVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKIILSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRRNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP 

EAARARRJRRRTDVRITG 


3480 


A 


117 


2226 


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWEIEELSPK 

WFEDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVEEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKEONKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY 

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRJRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDVSn^LNhn^JDGNFDVALDI 
SWVSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVPSSYSYDSSEVSLQTGLGPDGQGHQEIHI 



1273 



172 



3483 



3686 



HEGTFIENGQ WKNIHKPSRLIQPPGDPRGGREGO ' 

RQEVIFYLIIRRKPLFYLVNVIAPCILIILLAIFVI^ 

LPPDAGEKMGLSIFALLTLTVFLLLLADKVPETSL 

SVPEIKYLMFTMVLVTFSVILSVVVLNLHHRSPH 

THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF 

QPELSAPDLRRPmGPNRAVALLPELREWSSISYI 

ARQLQEQEDHDALKEDWQFVAMWDRLFLWTP 
IIFTSVGTLWIFLDATYHLPPPDPFP 



ERWDSGGADAE WALAD\V 1 A V WLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPREL VPKO APCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYLLYCLNPRYRVRWYVGFTVNTARR 

VQQHNGGRKKGGA\GRTSGRGPWEMVLVVHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 
LLET 



WKP WPC1D1 S WNLQVAARTLR VSSAQCGLVPT - 
NMRVESPVPAARASLTGSCVLGQAMPLRGGAGP 
SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 
LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIO 
DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 
KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 
TSVKASVLRIAASFPANLQLVLVLRPTGFFORTLS 
DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSO 
LTEDLGGTLDYCHSRWLCQRTAIESFALMVKOT 
AQMLQSFGTELAETELPNDVQSF\SSVLCAHTEK 
KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 
NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 
QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 
TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA 
LSLTASSFIGNKHYAVDSIRPKCQELRHLCDOFSA 
EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 
SQPVDKCQSQDGAEAALQEIEKFLETGAENKIOE 
LNAIYKEYESILNQDLMEHVRXVFQKQASMEEV 
FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 
CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 
RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 
ELLCVLEGYAAEMDNPLMAHLLSTGLHNKKDV 
LFGNMEEIYHFHNRIFLRELENYTDCPELVGRCF 
LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFO 
ECQRKLDHKLSLDSYLLKPVQRITKYQLLLKEM 
LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 
AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT 
KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 
GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 
IWYNAREEVYTVQAPTPEIKAAWVNEIRKVLTSO 
LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 
NKKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 
SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 
GPKKLVPOKYTWAPHEKGOPDALRVBRnnw 










ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNFENNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

KWVNKDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLKERYYSGLIYTYSGLFCVYINPYKKLPIYS 

EEIVEMYKGKKRHEMPPHIYAJTDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKKERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGELTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRJNK 

ALDKTKRQGASFIGILDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFG 

LDLQPCDDLIEkPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMK>HV1DPLNDNIATLLHQSSD 

KFVSELWKDVDR1IGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYKEQLAKLMATLRNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRVVFQEFRQRYE3LTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVnGFQACCRGYLARKAFABCRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQIILEDQNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLKNKHEAMITDLEERLRR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCER\ASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNIL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNS\FREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 

EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQE)QI 

NTDLKT.ERSHAQKNENARQQLERQNKELKVKL 



3486 






1782 



357 



3487 



1173 



3281 






QEMEGTVBCSKYKASITALEAKJAQLliliQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEO 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFWPRRMARKGAGDGSDEEVDGKADGAEAKP 
AE 



CSTOVSKAPLIYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATOIVSCTNKNLSKVP 

GNLFRLKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLKTWK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSOL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRTPSMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAO 

VGERLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

QRLLNETVDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASrVLVLLYLYLTPCPCKCKTTCROKNML 

HQSNAHSSILSPGP ASDASADERKAGAGKRVVFL 

EPLKDTAAGQNGKVRLFPSEAV1AEGILKSTRGK 
SDSDSV NSVFSDTPFVAST 



GDPRETKVFPSRSFARN 1 VGVSJ-LHQSHLFHTVSR 

IYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTfOLVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAI1KKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMGAPKELKFPNFKDRHSSDERTNA 

QWRQYLKDLTRTERQLIYDFYYLDYLMFNYTT 
PFL 



CDKSGAVPFSriRSPRRPSPRSAGPSLSSVSPRSO 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPOSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKNNDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTC ALMLLNTDLHGHNIGKR MTfYvnPTr; 










NLEGLNDGGDFPREL1JCALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHALATRAS\NYSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINWAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQVRTHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFL\ASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAP\YDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLAIITmTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA 

EDEEWATAGDRLTEFRKGAQSLPGPCAAAREG 

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP 

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPWQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 


3489 


A 


718 


2073 


IAAYHKALSYRGHVHANNRGTWIVHFTPPP 

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSnCMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSEINfVTGLDLSDFP 

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRV17OTQGMVTDQFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKJSCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 



3492 A



1321 



2024 



2024 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



RFHC1U.CECSFNDLNAKDLH VRGRRHRLQYRKK 

VNPDLPIATEPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRRLEEEPPQDVPPHAPPDWAOPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATI 

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQ.TRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKPTHSLLRRIAQOLPROL 

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTOPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVTVIRVLRDLCRRV 

PT\ WGALP A WAMELLVEKA VSSAAGPLGPGDA V 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 
RGGEGLV 



FVUDGALSGCKRGRAPRVPSMAGSLPPCVVDCG" 
TGYTKLGYAGNTEPQFIIPSCIAIRESAKVVDOAO 
RRVLRGVDDLDFFIGDEAEDKPTYATKWPIRHGII 
EDWDLMERFMEQWFKYLRAEPEDHYFLMTEP 
PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 
AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 
YVIGSCIKHIPIAGRDrTYFIQQLLREREVGIPPEOS 
LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWTK 
QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 
PDFMESISDWDEVIQNCPIDVRRPLYKNVVLSG 
GSTMFRDFGRRLQRDLKRVVDARLRLSEELSGG\ 
RKPKPVEVQVVTHHMQRYAV\WFGG\SMLASTP 
EFFQ VCHTKKD YEE YGPSICRHNP VFG VMS 



PNo vAj^LrtLfUAAVIPNTN V MFQDALGGRSRGS ' 

REESPAPSRAPASASLWRRLVWEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQ1PQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISOOTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLOVTHYL 

DAGQVKSVKPCLKQLQQCIQUSTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHOEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFOGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVOLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSOOL 

LQDHIEACSIJEHNLITWTDGPPPVQFQAQNGPN 
TSLASLL 



PNGVALLHLPGAAVIPNTNYMl-'QDALGGRSRGS" 

REESPAPSRAPASASLWRRLWVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQOTPY 
SEQ ID
NO: 


Method 


Predicted 

beginning
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLWLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAF1VTNLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAICRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVBETCMAYQGITQEKINEMRV 

APEQQM1ADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGAN\LNARTSMDE 

MPIDLCEEEEFKVLLLELK\HKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKWRRTQPVGTGPNL\YR 

KEYE/GEEAILWQRSA\AEDQRTSTYNGDIRET\R 

TDQENKDPNPRLEKVPVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APMADTTPNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLFNELRIVVEHIIMKPACPLFVRRLCLQS 

IAFISRLAPTVP 


3496 


A 


3 


2867 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNWIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQBGEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRKNNYKWVAASSKSPRVARRALSPR 

VAAENV CKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGD\RPALAHSGLKPLSG 

ETPLSAYKVKTRTXIIRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 
SEQ ID
NO: 



Method 



PCT/US01/04098 



3497 



3498 



3500 



Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



1586 



141 



790 



31 



190 



1586 



185 



2692 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



ViKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK 

VKTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNRLRPVASGGGJCAQPGSPWWRSKGYR 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRPGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTOKR 

HSRRAATSP APGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHE\APSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 
KPLHIKPRL 



A I'ARDLGCAKKiURV VMES'l PSRGLNRVHLOCR 

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL 

AKNAVVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRON 

LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAOLLSOA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RA1AINLSSGVSGAGGTVHQPGFIV\VETOYRL 

YAYTESELQIALIALFSEMLYPFP\NMW\ARVTR\ 

ESVQQAIASGITAQQUHFLRTRAHPVMLKQTPVL 

PPTITDQIRLWELERDRLRFTEGVLYNQFLSOVDF 

ELL\LAHAPKLGVLVFE/NTPAKRLMVVTPAGHS 
DVKRFWKRQKHSS 



RDLGPAALMlASASSFSSSQGVQQPSrVSFSOITR 
SLFLSNGVAANDKLLLSSNRITA1VNASVGSGORI 
LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTVS 
MRQGRTLLNCMAG\MSRSASLCLAYLMKYHSM 
S\LLDAHTWA/TKSRRPnRPNNGFWEQLINYEFK 
LFNNNTVRMINSPVGNIPDIYEKDLRMMTSM 



TAGFLLAPLliMgRLLTPVRRlLQL'IRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL 

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTWQRLMERIOLPWD 

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMWCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLALERGLRGTHMENVYDFYKPNLASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMIFHTPFCKMVQKSLARLMFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPL\DKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 
WYLERVDEQHRRKYARRPV 



MLF1EVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LIAAPGSITHQDLTEEAALNVTLQLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGOGR 
Predicted
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



3501 A 




Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

ARLVGALRBTVVAARALDHTLARQRLUAALHA 

LODFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

OV ADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWOQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDEFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGOPLVFSVDGLLQKITVRIHGDISSFW1KNPAG 

VSOGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN 

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR 

AAPQPSTVVPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSWMVTVTAGGREANPV 

PPTHAFLRLLVSAPAPQDRH 



RRAHPSHSRLSPYLSVSRDPV FF V 1 VSR1ILTLSA 
PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 
WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 
OLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 
LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 
NTTLFIDQVEAKWVEVKSKRRDMTVFSGLFVGG 
LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 
SOVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 
GGVCLNGGVCSVVDDQAVCDCSRTGFRGKDCS 
QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 
CYDLSQNP1QSSSDEITLSFKTLQRNGLMLHTGKS 
ADYVNLALKNG AVSLVINLG SG AFEALVEPVNG 
KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 
GELTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 
GSPVSNNFMGCLKEWYKNNDVRLELSRLAKQ 
GDPKMKIHGWAFKCENVATLDPITFETPESFISL 
PKWNAKKTGS1SFDFRTTEPNGLILFSHGKPRHQ 
KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 
KIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 
LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 
PTEVWTALLNYGYVGCIRDLFTOGQSKDIRQMA 
EVOSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 
GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 
FMKIQLPWMHTEAEDVSLRFRSQRAYGILMAT 
TSRDSADTLRLELDAGRVKLTVNLDCIRINCNSS 
KGPETLFAGYNLNDNEWHTVRWRRGKSLKLT 
VDDQQAMTGQMAGDHTRLEFHNIETGnTERRY 
LSSVPSNFIGHLQSLTFNGMAY1DLCKNGDIDYC 
ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 
SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK 
GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 
M1SRDTSNLHTVKIDTKITTQITAGARNLDLKSDL 
Y1GG V AKETYKSLPKL VHAKEGFQGCL AS VDLN 
rARTJXDLISDGSFSCNGTDSRRGMWKGPSTTvCQ 
SEQ ID
NO: 






Method 



3502 



3503 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 
sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



394 



72 



43 



3358 



3504 A 



1124 



139 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



EDSCSfN^uvCLQQWDGFSCDCSMrSFSGPLCND 

PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGK1GVK 

FNVGTDDIAffiESNAIINDGKYHYVRFTRSGGNA 

TLQVDSWPVEERYPAGRQLTIFNSQATIIIGGKEO 

GQPFQGQLSGLYYNGLKVLNMAAENDANIAIVG 

NVRLVGEVPSSMTIESTATAMQSEMSTSIMETTT 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTGMVVGIVAAAALCILILLYAMYKYRNRDE 

GSYHVDESRNYISNSAQSNGAVVKEKQPSSAKSS 
NKNKKNKDKEYYV 



^AHLPFTVTO^PKRKP^ 

RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 

KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 



SGURCiP VR VRSEQLSPSAEQ VSQISQISLGRRPLS 
SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 
CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL 
AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGO 
SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 
LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 
DEWNIARLNTVLDWEEECGISIEGVNTPYLYFG 
MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 
PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 
PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 
HGFNCAESTNFATVRWIDYGKVAKLCTCRKDM 
VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 
TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 
RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 
KSEAAVKLRMTEASSEEESSASRMQVEQNLSDHI 
KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 
ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 
ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 
NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 
VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 
WQTKPPNFAAEQEYNATVARMKPHCAICTLLMP 
YHKPDSShfEENDARWETKLDEWTSEGKTKPLIP 
EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 
VRVHASCYGIPSHEICDGWLCARCKRNAWTAEC 
CLCNLRGGALKQTXNNKWAHVMCAVAVPEVR 
FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 
GACIQCSYGRCPASFHVTCAHAAGVL\MEPDDW 
PYVVNITCFRHKVNPNVKSKACEKVISVGOTVIT 
KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 
TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 
GAKYFGSNIAHMYQVEFEDGSQIAMBCREDIYTL 
DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 
QAQQETYLGFWINSKKSQCNIFLSGTY 



RGfccyi'iJAJii-KKi'ACLGFGERLQEFSRLLRAVHR" 
SRAWTCYLAIRMLMATCCPSPTTTACTGPWORA 
PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 
PVAPLRTRPPLLISLPQDFRQVSSVmVDLLPETH 
RRVRLHKHG SDRPLGF YIRDGMS VR V APOGVLER 
VPGIFISRLVRGGLAESTGLLAVSDETT JEVNGIEV 
SEQ ID
NO: 


Method 


Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AGKTLNQVTDMMVANSHNXLWTVKPANQRNN 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505 


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELDDNAYDPDV 

NAKQIWIDKWIMDHICLTFTDNGNGMTSDKLH 

BCMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKDAIVFTKNGESMSVGLLSQTYLNEVIKAEHV 

VWIVAFNKHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAIIGKKGTRIITWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSILYLKPRMQIILRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANI^GVGVVGII 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEKFRIRQ 

PEMIPRINAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVILPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMrm^QIKVLQQRILEMNDKYVKKETCH 

HQTPTn A \/TPT T PQTTsjm^ W^Pn'RMV^nYOOAT FP 
^olHl UJ\ V r l^i^JC>oll>vjlvoi-/or J^rxivi v ov^ i v^/^/aj^cjc 

IERLKKQCSALQHVKAECSQCSNNTESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE 

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 


3506 


A 


2 


2120 


RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKIDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFENLNKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

IIVVPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSHKLIIFDARQNSVADTNKTKGGGYESESAYP 

NAELWLEIHNIHVMRESLRKLKEIVYPSIDEARW 

LSNVDGTHWLEYIRMLLAGAVRIADKIESGKTSV 

V\^CSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

tt \7Trvt7'\x7Tc , i?ntn?T7AT p vr*Rr^NrrYKrii a n A t\u ^ptf 

i Jj V XiisJC- W 1 L>-T OxXlvT AJ-on. V VJXIOIN LsL\ nJ\3JJr\XJr^xr XT 

LQFVDCVWQMTRQFPSAFEFNELFLITILDHLYS 
CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 
DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 
RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 
LQREVATRAVSSSSERGSSPSHFATSVHTLV 


3507 


A 


1 


2169 


GSSIKIRLTVLCAKNLAKKDre 



362 
Predicted

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 



Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 



6388 



Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



GSUQCHSTDTVKNTLDPKWNQHYD LYVGKTDSJ 
TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 
TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRJG 
TGGSWDCRGLLENEGTVYEDSGPGRPLSCFME 
EPAPYTDSTGAAAGGGNCRFVESPSQDQRLOAO 
RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEORT 
TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 
ELGPLPPGWEVRSTVSGPJYFVDHNNRTTOFTDP 
RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 
QRYERDLVQKLKVLRHELSLQQPQAGHCRJEVS 
REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 
LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 
YMLQINPDSSINPDHLSYFHFVGPJMGLAVFHGH 
YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 
VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 
GVRNVPVTEENKXEYVRLYVNWRFMRGIEAOFL 
ALQKGFNELIPQHLLKPFDQKELELnGGLDKroL 
NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 
RRARLLQFVTGSTRVPLQGFKALQGSTG\AAGPR 
LFTIHLIDANTDNLRKAHTCFNRIDIPPYESYEKL 
YEKLLTAVEETCGFAVE 



1L YINPADLu WNPPVSS W1EKREIQTERANLTILF ~ 
DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 
CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 
QDQLVDYRAEFSKWWLTEFKTVKFPSOGTIFDY 
YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 
SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 
VGAKLASLDPEAYLVKNVPFNYYTTSAMLOAVL 
EKPLEK1CAGROTGPPGNKKLIYFIDDMNMPEVD 
AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 
QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 
ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 
HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 
ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 
FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 
GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 
MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 
GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 
MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 
VLINDLLASGEIPDLYSDDEVENHSNVRNEVKSO 
GL VDNRENC WKFFIDRIRRQLKVTLCFSP VGNKL 
RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 
LQNTEGIEPTVKQSISBCFMAFVHTSVNQTSOSYLS 
NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 
TERLENGLLKLHSTSAQVDDLKAKLAAOEVELK 
QKNEDADKLIQVVGVETDKVSREKAMADEEEO 
KVAVIMLEVKQKQKDCEEDLAKAEPALTAAOA 
ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 
MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 
FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 
AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 
DLTAAQEKLAAKAKIAHLNENLAKLTARFEKA 
TADKLKCQQEAEVTAVTTSLANRLVGGLASENV 
RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 
KKYRQSLLDRTWRPYLSQLKTPIPVTPAT .DPT RA/r 